PREFACE



This report constitutes the proceedings of a three day workshop on Hypertext 
Standardization held at the National Institute of Standards and Technology (NIST) on 
January 16 - 18, 1990. The workshop was the first in what we hope becomes a series of 
standardization efforts. The workshop was sponsored by the Hypertext Competence 
Project of the National Computer Systems Laboratory of NIST. 

The workshop included plenary sessions and three discussion groups. Because the 
participants in the workshop drew on their personal experiences, they sometimes cited 
specific vendors and commercial products. The inclusion or omission of a particular 
company or product does not imply either endorsement or criticism by NIST. 

We of the Hypertext Competence Project gratefully acknowledge the assistance of all 
those who made the workshop a success. Further, I want to thank Dave Stotts for 
designing the cover graphic. 
ABSTRACT



This report constitutes the proceedings of a three day workshop on Hypertext 
Standardization held at the National Institute of Standards and Technology (NIST) on 
January 16 - 18, 1990. Efforts towards standardization of hypertext have already been 
initiated in various interested organizations. In recognition of these existing efforts, NIST 
sponsored the Hypertext Standardization Workshop organized by the Hypertext 
Competence Project of the National Computer Systems Laboratory. 

The major purpose of the Hypertext Standardization Workshop was to provide a 
forum for presentation and discussion of existing and proposed approaches to hypertext 
standardization. The stated workshop goals were to consider hypertext system definitions, 
to identify viable approaches for pursuing standards, to seek commonality among 
alternatives whenever possible, and to make progress towards a coordinated plan for 
standards development, i.e., a hypertext reference model. The workshop announcement 
solicited contributed papers on any aspect of hypertext standardization, including 
assertions that standardization is premature or inadvisable. Approximately 30 
contributions were received and distributed to the 65 workshop participants on the first 
day. 

The workshop included plenary sessions and three discussion groups. This 
proceedings includes the papers selected for presentation in plenary sessions, reports of 
the discussion groups, and supplementary materials. Major conclusions of the workshop 
were that the discussion groups should continue their technical efforts, and that NIST 
should sponsor at least one more workshop to provide a forum for public discussion of 
progress. 

Key words: hypermedia; hypertext; standards. 
INTRODUCTION



Over the past several years we have seen a significant increase in the availability of 
document and information management systems that call themselves Hypertext or 
Hypermedia implementations. These systems have received a degree of acceptance from 
the user community and are being integrated into an increasing number of application 
development projects. There is every reason to believe that this trend will continue to 
grow and influence the marketplace in the foreseeable future. 

Although, at present, Hypertext/Hypermedia systems have no agreed formal 
definition, there is agreement on some of the underlying concepts that characterize them. 
Recently, a number of authors have stated requirements for hypertext standards and some 
have offered definitions and initial specifications for consideration. In several cases 
specialized standardization efforts have already been initiated through interested 
organizations. In recognition of this emerging activity, the National Institute of Standards 
and Technology (NIST) sponsored the Hypertext Standardization Workshop. One 
consideration of the workshop was to determine if the evolution of Hypertext and 
Hypermedia technologies has reached the point where it makes sense to consider formal 
standardization. 

The major purpose of the Hypertext Standardization Workshop was to provide a 
forum for presentation and discussion of existing and proposed approaches to hypertext 
standardization. We solicited contributed papers on any aspect of hypertext 
standardization, including assertions that standardization is premature or inadvisable. We 
received approximately 30 contributions totaling more than 400 pages, which were 
distributed to all workshop participants on the first day. The stated workshop goals were 
to consider hypertext system definitions, to identify viable approaches for pursuing 
standards, to seek commonality among alternatives whenever possible, and to make 
progress towards a coordinated plan for standards development, i.e., a hypertext reference 
model. 

Of the contributed papers, those of particularly high quality and general interest were 
accepted for publication and featured during a plenary session on the opening day of the 
workshop. Each author was given approximately 25 minutes to present a particular point 
of view. These individual papers are presented alphabetically in this proceedings. The 
remainder of the first day and all of the second day consisted of discussion groups set up 
in response to issues raised in the contributed papers. 

Three discussion groups met in parallel on the topics of Hypertext Models, Hypertext 
Data Interchange, and Hypertext User Requirements. Each group chose one or more 
"Presentors" to convey group opinions to the whole workshop. Summaries of the 
deliberations and conclusions of these discussion groups, authored by the presentors, are 
included herein. 



The morning of the third day of the workshop consisted of reports from each of the 
three discussion groups and a general discussion of where to go from here. In general, the 
groups were quite pleased with their progress and expressed a desire to meet on a 
somewhat regular basis to continue deliberations. There was general agreement that a 
recognized hypertext/hypermedia standards group could function as the focal point in 
defining a hypertext data model and a reference model that addresses other more 
specialized activities in areas such as documents, graphics, video, and sound. 

Craig Thompson raised the issue of establishing a more formal hypertext/hypermedia 
"study group" with regular scheduled meetings and operating procedures. Possibilities for 
organizing such a group under the auspices of ACM, X3, IEEE, GCA, NIST, or some 
other ANSI accredited organization were discussed, but with no definitive conclusion. 
Interested individuals were encouraged to pursue possibilities within these organizations. 

Major conclusions of the workshop were that the individual discussion groups should 
continue their respective technical efforts, possibly via private communications, and that 
NIST should sponsor at least one more workshop to provide a forum for public discussion 
of progress. A decision could then be made as to the desirability of establishing a more 
formal standardization group with status in some ANSI accredited standards organization. 



Leonard Gallagher 
Workshop Chairperson 



REPORTS OF DISCUSSION GROUPS 



This section of the proceedings contains the reports as submitted by the presentors of the 
discussion groups. The material was presented at the closing plenary of the workshop. 
1. HYPERTEXT MODELS DISCUSSION GROUP 

Presentors: 

Reports of this group follow: 

• Reference and Data Model Group: Work Plan Status 

• Reference and Data Model Group: Comparison of Three Models 

• Reference and Data Model Group: Responses to "Issues for Discussion Group 
Consideration" 
Reference and Data Model Group (RDMG): 
Work Plan Status 



Reported by 
H. Van Dyke Parunak 
Industrial Technology Institute 

January 26, 1990 



Abstract 



A reference model is a structured description of some domain that can be used to compare existing 
implementations in that domain, design new implementations, and (most important for our purposes) map out 
possible areas for standardization and show their relation to one another. The main output of the RDMG 
during the NIST workshop was a work plan for arriving at such a reference model. The work plan that 
we propose has the following structure, where the flow of activity is down the page (except for the single 
feedback loop), and where activities marked by * received significant attention during the workshop. 

                         |
                         v
        +----------------+----------------+
        v                v                v
     *Define        *Brainstorm     *Compare Existing
    "Hypertext"       Concepts        Models (DTL)
          \              |              /
           \             v             /
                 *Organize Ontology
                         |
                         v
             Rank Concepts by Centrality
                         |
                         v
             Inventory Existing Systems
                         |
                         v
           Construct "Implementation" Model
                         |
                         +
                         |
                         v
              Select Areas for Standards 



The rest of this document defines each of these steps, and reports what we have done in each of them. 

This document summarizes the portion of the final RDMG presentation that I delivered on 18 January 
1990. It represents my perception of the deliberations of the group, but has not been reviewed or formally 
approved by the other members. 



1 Define 'Hypertext' 



This definition is intended to be a brief, succinct statement of our domain, to provide some degree of focus 
during subsequent stages. It may well change considerably as a result of later analysis. We began with 
a definition that has been circulating for several years, and modified it to reflect the valuable distinction 
between 'hypertext' (as a structured body of information) and 'hypertext system': 

A Hypertext is a network of information nodes connected by means of relational links. 

A Hypertext System is a configuration of hardware and software that presents a Hypertext to users and 
allows them to manage and access the information that it contains. 
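
The first definition can be read directly as a data structure. The following is a minimal sketch in Python, not a model proposed by the group: the class names, the `node_id` field, and the sample relation names are all illustrative assumptions. It shows only that a hypertext, in the sense defined above, is a set of information nodes plus a set of named (relational) links between them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    node_id: str        # hypothetical identifier field (an assumption)
    content: str = ""   # the information held by the node


@dataclass
class Link:
    source: str         # node_id the link leaves from
    target: str         # node_id the link arrives at
    relation: str       # name of the relationship the link expresses


@dataclass
class Hypertext:
    nodes: Dict[str, Node] = field(default_factory=dict)
    links: List[Link] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.node_id] = node

    def add_link(self, link: Link) -> None:
        self.links.append(link)

    def linked_from(self, node_id: str, relation: Optional[str] = None) -> List[str]:
        """IDs of nodes one link away, optionally restricted to one relation."""
        return [
            link.target
            for link in self.links
            if link.source == node_id and relation in (None, link.relation)
        ]


# Illustrative content only; the node texts and relation names are made up.
ht = Hypertext()
ht.add_node(Node("n1", "Workshop overview"))
ht.add_node(Node("n2", "Reference model"))
ht.add_node(Node("n3", "Glossary"))
ht.add_link(Link("n1", "n2", "elaborates"))
ht.add_link(Link("n1", "n3", "defines-terms"))
print(ht.linked_from("n1"))
print(ht.linked_from("n1", "defines-terms"))
```

A Hypertext System, in the second definition, would be whatever hardware and software presents such a structure to users; nothing in this sketch addresses presentation or access.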

2 Brainstorm Concepts 

In an effort to scope our discussions, we brainstormed terms and concepts describing hypermedia, and 
assembled a list of about 80. These are listed in more organized fashion below. 

3 Compare Existing Models 

In order to build on existing work, representatives of three detailed models presented at the workshop (the 
Dexter model, the Trellis r-model, and Danny Lange's model) compared and contrasted their respective 
models. A separate report by John Leggett summarizes those discussions. 

4 Organize Ontology 

We attempted to organize the set of terms and concepts to bring like things together. This section reviews 
the resulting taxonomy of concepts, and then describes some further analysis that might be conducted to 
organize the list even further. By itself, this organized list is a limited reference model. Subsequent steps 
refine it and seek to cast it in a form that has been useful in the past in guiding the development of standards. 

4.1 A Preliminary Organization 

We found it useful to sort the concepts produced by brainstorming into three main categories: Entities, 
Properties, and Functions or Operations. Some concepts did not seem to fit cleanly into any of these, and 
were relegated to a catch-all category, Abstractions. 

Entities These are the objects that a hypertext system must manipulate; together, they make up a 
hypertext. 

• Components, each with a UID (unique ID) 

- Link or relationship; may be warm, hot, abstract, dynamic. 

- Nodes; can have fields, contents, anchors/buttons/interactors/link markers 

- Composites, including idioms, paths, tours, webs, networks 

• Whole documents, also with UID's (container, stack, frame set, guideline) 

• Navigational aids, including index, map, table of contents, fisheye view 

• Display entities: window, canvas. Card vs. scroll distinction applies here. 

• Functional stuff: presentation specification; resolver. 
Properties These can be either of entities or of the entire system. 

• Properties of Entities (should probably be merged with the Entity term list) 

- Attributes (of nodes and links; includes temporal and display behavior) 

- Component format and structure (e.g., locktext) 

- Network topology (e.g., hierarchy, hypercube, DAG) 

- Size of canvas (scroll vs. card) 
• Properties of the System 

- Concurrency, including both multiuser and multithread 

- Synchrony 

- Existence of a formal model 

- System performance (e.g., speed) 

- Timing (e.g., to support music, animation, and video) 

- Distributed vs. local 

- Monolithic vs. open (as in a link service or link protocol) 

- Referential integrity (are dangling links permitted?) 

- Context sensitivity 

- Interoperability 

- Operating modes (browse, author, ...) 

Functions Initial attempts to classify these further [...] the 
contents of each group: 

• Knowledge modification 

- Modifying system knowledge in place: edit (including cut/paste and structured editing), update, 
annotate 

- Move information into or between systems: interchange; conversion and parsing of raw text 

• Navigation 

- Search and query; need for managing relevance of search; filters 

- Browsing semantics (progressive disclosure; histories; views; path macros; bookmarks) 

- Support tools: scripting, addressability, triggering (actions to take on arriving and departing a 
node) 

• 'Yucky Systems Stuff' 

- Tailoring 

- Interfaces, of two sorts: 

* Foreign nodes (application programs that can be activated at a node); APIs 

* Communications protocols (between separate programs at the same layer) and services 
(between layers of a single program) 

- Versioning, journaling 

- Access control 



Abstractions This catch-all category holds a number of terms that did not seem to fit elsewhere. 
Alternative titles for this group of terms are 'metadata' and 'implementation tools.' 



• Schema 

• Type 

• Class 

• Object 

• Data models (E-R, semantic) 

• Encapsulation 

• Layer 



4.2 Further Organization 

One can go further (though we didn't have time). For example: 

• Developing an 'Entities x Properties' relation to show what entities support what properties. 

• Developing an 'Entities x Functions' relation to show what entities support what functions. 
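
As an illustration of what an 'Entities x Functions' relation could look like once written down, here is a small Python sketch. The entity and function names are taken from the term lists above, but the particular pairings are hypothetical placeholders: the group did not produce this relation, and the helper names are assumptions of the sketch.

```python
# A relation is simply a set of (entity, function) pairs.
# The pairs below are illustrative only, not findings of the group.
SUPPORTS = {
    ("node", "edit"),
    ("node", "annotate"),
    ("node", "search"),
    ("link", "edit"),
    ("link", "navigate"),
    ("composite", "navigate"),
}


def functions_of(entity, relation=SUPPORTS):
    """Functions that a given entity supports, in alphabetical order."""
    return sorted(f for e, f in relation if e == entity)


def entities_supporting(function, relation=SUPPORTS):
    """Entities that support a given function, in alphabetical order."""
    return sorted(e for e, f in relation if f == function)


def as_matrix(relation=SUPPORTS):
    """Return (entities, functions, rows) with one boolean row per entity."""
    entities = sorted({e for e, _ in relation})
    functions = sorted({f for _, f in relation})
    rows = [[(e, f) in relation for f in functions] for e in entities]
    return entities, functions, rows


entities, functions, rows = as_matrix()
print("functions:", functions)
for entity, row in zip(entities, rows):
    print(entity, row)
```

The matrix form makes gaps visible at a glance, which is the point of such a relation: an entity row with few entries, or a function column supported by only one entity, is a candidate for closer examination.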

5 Rank Concepts by Centrality 

In choosing areas for standardization, we want to focus on those topics that are characteristic of most or all 
hypermedia systems, and not on those that appear only in a few systems for special purposes. The [...] 

6 Inventory Existing Systems 

One important use of a reference model is as a guide to comparing systems, and a test of the model that 

7 Construct 'Implementation' Model 

The objective here is to derive a layered model, like the OSI reference model, in which the layers represent 
successive functionality added to a core, with hardware at the bottom. 

The group expressed some difference of opinion on whether OSI is a good example of what we want. 
An interesting discussion within the group centered on whether a monotonic layering from hardware to 
[...] 
                                STACKS 

TASK:      Store          Process        Present             User 

LAYERS:    Node, Link     Navigate       Window, Button      Concept 
           OODB                          Virtual Terminal 
           File System 

DEVICE:    Disk           CPU            CRT/Keyboard        Eye/Hand 
                 \      /      \       /              \     / 
MEDIUM:            Bus            LAN               EM Radiation 



The layers listed in this diagram are incomplete, but illustrate the difference between those that are 
central to hypermedia and must be described in our model (above the dashed line), and those that should 
be developed in other disciplines (below the line). What is critical for our purposes is the clear definition of 
the services that connect one layer to another. 

8 Select Areas for Standards 

Once developed, a reference model helps map out areas for standards. Focus is important here, and the 
model helps provide it in two dimensions. The ranking of concepts in the ontology shows how central each is 
to hypermedia, and helps us focus on standardizing those concepts most likely to be of widespread use. The 
implementation model helps us identify which concepts are best standardized in other research communities 
(such as CHI, DB, OOPS, windowing systems) and which require the focused attention of researchers in 
hypertext. Graphically, the focusing process seeks to identify the region 'X' in the diagram below for 
standardization. 



                 CHI   |          |          | 
                       +----------+----------+ 
    Which        HT    |          |    X     | <--- 
    Community?         +----------+----------+ 
                 DB,   |          |          | 
                 OOPS  |          |          | 
                       +----------+----------+ 
                         Few HT     All HT 
                         Systems    Systems 

                   How central is it to hypertext? 
Reference and Data Model Group: 
Comparison of Three Models 



John J. Leggett 
Department of Computer Science 
Texas A&M University 

The Reference and Data Model working group spent 45 minutes comparing and contrasting the R-model, 
Dexter, and Lange reference models. David Stotts, Danny Lange and John Leggett spent another 90 
minutes over dinner discussing the three models. A summary was provided by John Leggett during the final 
plenary session. As these three models are currently under development, the comparisons are rather broad. 
It is interesting to note that the three models were developed independently and with varying 
levels of collaboration. The results of these discussions are presented below in mostly tabular form. 

Differences 

          Type                     Links                Anchors        Formalized? 

R-model   Meta-model for           No links, but        No distinct    No 
          systems specification    relations defined    anchors 

Lange     Model of hypertext       Allows dangling      Anchors and    Yes, in VDM 
                                   links                regions 

Dexter    Model of hypertext       Does not allow       Anchors        Yes, in Z 
          systems                  dangling links 
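
The practical meaning of the "dangling links" column can be shown with a short Python sketch. This is not the VDM or Z formalization of either model; it only illustrates, under the assumption that a link is a (source, target) pair of node identifiers, the difference between a policy that admits links to missing nodes and one that rejects them. The function names are hypothetical.

```python
def dangling_links(node_ids, links):
    """Return the links whose source or target is not a known node."""
    known = set(node_ids)
    return [(s, t) for (s, t) in links if s not in known or t not in known]


def add_link(node_ids, links, link, allow_dangling):
    """Append a link, refusing dangling ones when the policy forbids them."""
    if not allow_dangling and dangling_links(node_ids, [link]):
        raise ValueError("dangling link rejected: %r" % (link,))
    links.append(link)


nodes = {"a", "b"}
links = []
add_link(nodes, links, ("a", "b"), allow_dangling=False)   # both ends exist
add_link(nodes, links, ("a", "zz"), allow_dangling=True)   # permissive policy
print(dangling_links(nodes, links))
```

Under the strict policy the second call would raise `ValueError` instead of storing the link, so a store that never permits dangling links never needs the after-the-fact check.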

Similarities 

Support for types in all three models is through arbitrary attribute/value pairs. 
All three models have separated content, structure and presentation: 

          Content             Structure              Presentation 

R-model   Abstract content    Structure and          Concrete and 
                              abstract containers    visible levels 

Lange     Schema              Networks and           Unspecified 
                              structures 

Dexter    Within-component    Storage layer          Run-time layer with 
                                                     presentation specifications 



Danny B. Lange, "A Formal Model of Hypertext," these proceedings. 
Hypertext Reference Model Group 
Responses to "Issues for Discussion Group Consideration" 

James Black 

1. What is the current state-of-affairs in this topic area? What is likely to happen in the 
near future? 

The Reference Model Working Group did a reasonably thorough examination of three 
independently derived hypertext models and identified no essential inconsistencies which 
would preclude eventual consensus. Each of the three models was the product of a 
different analytical approach and there remain significant areas of confusion and lack of 
current consensus which seem to be largely due to syntactical differences. Further open 
dialogue among the participants would improve this situation. 

2. Are emerging technologies driving this topic in a certain direction? Is there sufficient 
stability to warrant further pursuit of standardization at this time? 

The sessions revealed no clear evidence that "emerging technology" was driving any 
aspect of the hypertext concept in a particular direction. The only indication of any 
"driving forces" which may be prematurely affecting aspects of the evolution of hypertext 
technology are related to other standardization efforts, specifically, ODA and SGML. There 
does seem to be sufficient stability in the shared understanding of basic hypertext 
concepts to warrant further pursuit of standardization. 

3. What are the most important concepts? Are there agreed definitions? Is there a glossary 
available, or set of candidate key words? 

The essential concepts of hypertext would include a data model with the following 
features: 

• data type and media independence 

• "format" and "content" independence 

• freely defined, relational links between freely defined data elements 

• no inherently hierarchical structure 

• distinct separation of format and content 

They would also include such functional features as navigational, authoring, presentation, 
and systems management tools. 

4. What is the interdependency of this topic area with other topic areas identified at this 
workshop? 
There is a need to develop a glossary and taxonomy of hypertext terminology which 
includes formal (mathematical) definitions where available. There is available a core set 
of candidate key words. 

5. What are the major problems and controversies? Is compromise possible? or would 
alternative approaches better serve the vendor/user communities? 

There is significant interdependency between the hypertext reference model and 
system, interchange issues. 

6. What is the ultimate goal for this topic area? a user guideline? a domestic standard? an 
international standard? something else? What is an appropriate sequence of steps leading 
to this goal? 

The ultimate goal of this working group is to establish a hypertext system reference 
model and use it to establish a hypertext glossary and taxonomy and to identify candidate 
areas for standardization activity. 

7. What concepts in this area are appropriate for standardization? What concepts are not 
appropriate for standards? What can inhibit the development of standards? Is something 
ready for standardization at this time? 

There are no areas ready for standardization at this time. 

8. What role can NIST play in achieving the goals of this topic? Are further workshops 
desirable? What is the most appropriate follow-on activity after this workshop? 

NIST can establish a formal, on-going hypertext study group that publishes consensus 
findings and recommendations which NIST links to relevant standards organizations. 
2. DATA INTERCHANGE DISCUSSION GROUP 



Moderator: Len Gallagher 
Presentor: Tim Oren 
Scribe: Jan Walker 



Rob Akscyn 
Gregory Crane 
Valerie Florance 
Edward A. Fox 
David Fristrom 
Len Gallagher 
Steve Newcomb 
Charles Nicholas 
Tim Oren 
Kenneth Pugh 
Victor Riley 
Jan Walker 



Reports of this group follow: 

• Summary of the Hypertext Interchange Group 

• Note on Representing Anchors 
Summary of the Hypertext Interchange Group 



The Interchange Group first discussed how the problem could be partitioned. We agreed 
that ideally the representation of the data and its presentation to the user should be 
separated. However, for efficiency reasons most existing hypertext systems which support 
graphics in fact store bit maps and specific screen coordinates. This is an obstacle to 
interchange between platforms with differing display architectures. 

We also made the distinction between a "delivery interchange" standard and an 
"archival interchange" standard. A delivery interchange standard would be directly 
usable by a conforming hypertext system without translation. We regarded this as very 
difficult to achieve in the short term due to differences in hypertext systems' methods of 
storing and indexing their data, which are usually highly optimized for the particular 
platform and application. The dependence on display formats already noted is also an 
obstacle to a delivery interchange standard. 

An "archival interchange" standard is one in which the information owner may store 
hypertext in a system independent fashion. For actual delivery either the information 
owner or end user would need to translate the archival interchange format into a format 
specific to a particular hypertext software/hardware configuration. Any changes authored 
by the end user would have to be rolled back to the archival store before reaching other 
platforms, rather than attempting direct interchange. We agreed that this goal was more 
achievable in the short run, and turned our discussion in this direction, but without 
disputing the eventual value of a delivery interchange format, or the need for further 
experiments with delivery to define requirements for the archival representation. 

We proceeded to compare relevant interchange proposals from the working papers or 
otherwise drawn to the attention of the group. These included a discussion 
paper submitted by Ken Pugh, Victor Riley's Intermedia exchange paper, portions of the 
HyTime proposal, and the so-called "HIP" Hypertext Interchange Protocol developed at 
Apple, Xerox PARC and Brown IRIS. A copy of the HIP paper was supplied by Victor 
Riley of Brown IRIS. The group voted to request that the HIP paper (Bornstein and Riley, 
"Hypertext Interchange Format") and relevant sections of HyTime (Newcomb, 
"Explanatory Cover..." and Section 7.2) be included in the final Proceedings of the NIST 
Workshop. 

Comparing these formats showed that all were adopting a partitioning of the problem 
into data objects, anchors, and links. Anchors form the data object type specific endpoints 
for links. While there were abundant differences in terminology, a first reading showed 
basic conformance to this layering, and we agreed that this should be drawn to the 
attention of the modeling group. 
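
The three-way partitioning can be sketched in a few lines of Python. This is an illustration of the layering only, not the format of HIP, Intermedia, or HyTime: the class and field names, and the use of character offsets as the type-specific anchor value for text, are assumptions of the sketch. The point it shows is that links refer only to anchors, and only the anchor knows how to address the inside of a particular kind of data object.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DataObject:
    object_id: str
    media_type: str          # e.g. "text"; decides how anchor values are read
    data: bytes


@dataclass(frozen=True)
class Anchor:
    anchor_id: str
    object_id: str           # the data object this anchor points into
    value: Tuple[int, int]   # type-specific; here (start, end) offsets for text


@dataclass(frozen=True)
class Link:
    link_id: str
    endpoints: Tuple[str, ...]   # anchor IDs only; links never touch raw data


def resolve_text_anchor(anchor: Anchor, objects: Dict[str, DataObject]) -> str:
    """Return the span of text an anchor selects (text objects only)."""
    obj = objects[anchor.object_id]
    if obj.media_type != "text":
        raise ValueError("this resolver only understands text objects")
    start, end = anchor.value
    return obj.data[start:end].decode("ascii")


def link_endpoints(link: Link, anchors: Dict[str, Anchor],
                   objects: Dict[str, DataObject]) -> List[str]:
    """Resolve every endpoint of a link through its anchor."""
    return [resolve_text_anchor(anchors[a], objects) for a in link.endpoints]


objects = {"d1": DataObject("d1", "text", b"hypertext links")}
anchors = {
    "a1": Anchor("a1", "d1", (0, 9)),
    "a2": Anchor("a2", "d1", (10, 15)),
}
link = Link("l1", ("a1", "a2"))
print(link_endpoints(link, anchors, objects))
```

Supporting another media type would mean adding another resolver for its anchor values; the `Link` class would not change, which is the benefit of the layering.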

It was also noted that most of the interchange proposals used SGML or SGML-like 
markups. After some discussion, it was agreed that SGML was a reasonable basis for 
further interchange experiments. This position is adopted without prejudice to an 
eventual standard, due to a number of participants' concerns about technical issues (e.g., 
efficiency, limits of a parser driven implementation), and prejudgment of the decision 
process. We agreed that documents resulting from these discussions should be conveyed 
to the HyTime (ANSI X3V1.8M) committee for inclusion in their working document set. 

A general discussion of related standards ensued. There was consensus that wherever 
possible hypertext interchange standards should incorporate existing media type standards 
without requiring changes in those standards. 

An ad hoc group composed of Ed Fox, Steve Newcomb, Tim Oren, and Victor Riley 
met during the evening to continue the comparison of the various interchange proposals. 
They reported to the whole group that they had succeeded in a first pass reconciliation of 
the anchor levels of HIP, Intermedia and HyTime. Their notes are appended in the 
interchange section of the proceedings (under the title "Note on Representing Anchors") 
rather than incorporated here, as they were not a result of the entire group. 

The whole group strongly suggests that further experiments with interchange between 
existing systems be undertaken. We noted the need for a publicly available, editorially 
controlled document set for this purpose. This should be in the few hundred to few 
thousand node size, marked up in SGML with linking information provided. Further 
volunteers and funding for these experiments are an issue. Availability of a free or 
inexpensive SGML parser is required if universities are to participate in the experiments. 

We identified a number of significant issues which were not addressed due to time 
constraints: 

• Making a complete list of relevant data type standards 

• Requirement for unique naming and identification services, which is a problem with 
wider scope than hypertext alone. 

• Link typing, type definition and hierarchies, N-way link structures 

• Composites - a taxonomy of existing uses and representations 

• Versioning 

• Representation of time-based media, e.g., music, video, and links conveying timing 
information 

These should be addressed in further sessions, as they all influence requirements for an 
interchange standard and some (particularly link typing and composites) are the subject of 
active research and controversy. 



Submitted by Tim Oren 
January 24, 1990 
Note on Representing Anchors 
Reported by Tim Oren 



An ad hoc subgroup of the Interchange working group met to compare various proposals 
for archival interchange. It was composed of Ed Fox, Steve Newcomb, Tim Oren, and 
Victor Riley. These notes are the result of that meeting. They are a first pass which has 
not been considered by any other group. See the summary of the Interchange group for 
context and definition of terms. 

We chose to proceed by focusing on the anchor or "anchor-like" portion of each 
proposal. We began by considering how the features of the Intermedia Interchange could 
be added to the HIP proposal, and expressed the result in HIP-like terms. We then 
attempted to reconcile this result with the formalism and language of the pertinent 
sections of HyTime. Note that this applies only to anchors, and there may be additional 
difficulties in reconciling layering strategies when we look at the link layers of the various 
proposals. 

1. Reconciliation of Intermedia exchange and HIP 

This is a semi-formal presentation of patches to the <ANCHOR> section of the HIP 
specification. The other sections of HIP have not yet been brought into conformance: 

<NAME> - optional, ASCII string, user displayed or for use of system. Usage ideas: this 
could be the name of a HyperCard button, or used as an item for searching, or as comments 
to be displayed as preview. 

<ID> - required, a unique ID in a format TBD. Uniquely identifies this anchor within the 
scope of the interchange set. 

<CREATION> - optional. 

    <WHEN> - Date/time of creation in a standard form TBD. Indicates the moment 
    of original creation of the anchor (even if it was later moved). 

    <BY> - the unique ID (TBD) of the user/authority who created the anchor. 

<MODIFIED>* - optional, optionally multiple. 

    <WHEN> - Date/time of the particular modification. It is an application policy matter 
    whether all, just the latest, or no mods are recorded. 

    <BY> - the unique ID of the modifying user/authority. 

    <VERSION> - a unique ID of the referenced version. How to use this is a policy 
    matter of the system. If it's the same as the <ID> of this anchor, this is the current 
    version. 

<LOCATION> - required. 

    <ANCHOR-OBJECT-ID> - required. The unique ID of the data object (file - 
    chunk - whatever) to which this anchor refers. 

    <ANCHOR-VALUE>+ - object type specific, required, optionally multiple. Note 
    that this could refer to multiple selections, elements, etc. within the data object. 

    <PRESENT-SPEC> - object type specific, optional, regulates how the anchor is to 
    be presented, e.g., run the sound editor or play the sound, positioning information 
    for the 3-D editor view of an IGES object. 
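
To make the required and optional parts of the patched <ANCHOR> concrete, the following Python sketch models the fields as plain records and checks the "required" and "one or more" rules stated above. It is a reading aid only. The class names, the grouping of <WHEN>, <BY> and <VERSION> under their parent elements, and the use of plain strings for IDs and dates (whose formats the text leaves TBD) are all assumptions, and nothing here reproduces the actual HIP syntax.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Stamp:
    """Shared shape assumed for <CREATION> and <MODIFIED> entries."""
    when: str                      # date/time; format is TBD in the source
    by: str                        # unique ID of the user/authority
    version: Optional[str] = None  # <VERSION>, used on <MODIFIED> only


@dataclass
class Location:
    anchor_object_id: str             # <ANCHOR-OBJECT-ID>, required
    anchor_values: List[str]          # <ANCHOR-VALUE>+, one or more
    present_spec: Optional[str] = None  # <PRESENT-SPEC>, optional


@dataclass
class Anchor:
    id: str                           # <ID>, required
    location: Location                # <LOCATION>, required
    name: Optional[str] = None        # <NAME>, optional
    creation: Optional[Stamp] = None  # <CREATION>, optional
    modified: List[Stamp] = field(default_factory=list)  # <MODIFIED>*


def validate(anchor: Anchor) -> List[str]:
    """Return a list of violated rules; an empty list means the anchor is OK."""
    errors = []
    if not anchor.id:
        errors.append("<ID> is required")
    if not anchor.location.anchor_object_id:
        errors.append("<ANCHOR-OBJECT-ID> is required")
    if not anchor.location.anchor_values:
        errors.append("<ANCHOR-VALUE> needs at least one entry")
    return errors


def is_current_version(anchor: Anchor, stamp: Stamp) -> bool:
    """Per the text: a <VERSION> equal to the anchor's <ID> marks the current version."""
    return stamp.version == anchor.id


# Placeholder values; real ID and date formats are TBD in the source.
good = Anchor(id="a1", location=Location("obj1", ["selection-1"]))
good.modified.append(Stamp(when="placeholder-date", by="user-7", version="a1"))
bad = Anchor(id="", location=Location("obj1", []))
print(validate(good))
print(validate(bad))
```

Keeping the type-specific parts (`anchor_values`, `present_spec`) as opaque strings mirrors the text's position that their interpretation depends on the data object type.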

2. Reconciliation with HyTime terminology (sections under 7.2.5) 

HyTime as written contains within its "location" layer information which is both generic 
to the concept of anchor, and specific to certain data types. We try to separate this here. 
Again, this has not been reconciled with the link layer of HyTime or HIP and problems 
might emerge there. 

The general concept of "entloc" corresponds to the HIP <ANCHOR> idea. The ID within 
entloc corresponds directly to the <ID> in HIP. The "dataent" corresponds to the 
<ANCHOR-OBJECT-ID> of HIP. 

Notation Data Location (ndloc) is HyTime's generic anchor, corresponding directly to the 
HIP construct above. Its type specific part is represented in the "formula," which 
corresponds to the <ANCHOR-VALUE> of HIP. "Snap" should probably be considered 
part of a type-specific construct rather than part of a generic anchor. HIP would probably 
represent it as part of the <PRESENT-SPEC>. A reasonable default data type is 
undifferentiated byte stream. 

The other location constructs are viewed as data type specific anchors. 

Character data set location (cdloc) is an anchor into sequences of ISO defined characters 
(NB: this is not the same thing as a font or byte sequence). 

Document locations (elemloc) (7.2.5.2-3) are the SGML object type specific anchor 
definitions. Element location is SGML type specific and identifies a single "node" within 
the hierarchical structure created by an SGML markup. This may be specified using an 
ID, if one exists for the node, or using a path designator from the root. Point location 
allows anchoring to a spot within an element. 

All of these constructs might be further generalized by allowing multiple "selections" to 
be incorporated within one "location." 

REPORT FROM THE USER REQUIREMENTS WORKING GROUP 



Robert J. Glushko 

Search Technology 
Norcross, GA 

This report summarizes meetings held on January 16-17, 1990 during a workshop on 
Hypermedia Standardization held at the National Institute of Standards and Technology in 
Gaithersburg, MD. In addition to the author, the members of the Working Group for User 
Requirements were Carol Adams, Peter Aiken, Jean Baronas, Denise Bedford, Tim Berners-Lee, 
Valerie Florence, Kevin Gamble, Louis Gomez, Seymour Hanfling, Kathryn Malcolm, Cathy 
Marshall, Fontaine Moore, Dan Olson, Duane Stone, Clifford Uhr, David Wojick and Don 
Young. The group followed an agenda set by NIST to identify the current state of affairs, 
important driving and constraining factors, potential areas for standardization, and research 
needs. 

Complete consensus on these complex topics was impossible in two days for a group this 
size, so this report emphasizes the majority themes for the issues that received the most attention. 
I apologize for my own biases, which undoubtedly show through. 

THE CURRENT STATE OF AFFAIRS FOR HYPERTEXT 

In recent years hypertext concepts for making information more accessible and usable 
have been applied to a bewildering variety of applications: 

Reference books, encyclopedias, dictionaries 
Library collections and archival literature 
Online software reference manuals 
Policies, procedures, regulations 
Maintenance and diagnostic information 
Online help systems and embedded training 
Education, tutorials 
Engineering and CAD 
Professional project management 
Collaborative problem-solving and authoring 
Interactive fiction, entertainment 
Museum directories and information kiosks. 



Four basic factors appear to account for the rapid spread of hypertext design concepts. 
These are enabling technology, documentation standards initiatives with hypertext implications, 
market pressure, and academic interest. 

Enabling technology. Hypertext applications require a significant amount of local 
processing power and storage capacity that until the mid 1980s was not readily available. 
Hypertext (and especially hypermedia) applications are also benefiting from increased data 
transfer capabilities enabled by advances in data compression, fiber optics, and progress toward 
an end-to-end digital telecommunications network. Nevertheless, having the delivery and 
storage technology base for hypertext systems would have been meaningless without the 
concurrent maturation of user interface design concepts and tools. Object-oriented programming 
and prototyping toolkits that embody direct manipulation user interface concepts make it 
possible to design and implement the rich functionality of hypertext systems in a cost-effective 
way. 

Documentation standards initiatives with hypertext implications. Some major 
standards efforts in related areas have made hypertext both more necessary and more likely. The 
first of these is SGML, the Standard Generalized Markup Language [7]. In 1986 SGML became 
an international standard (ISO 8879) for defining the logical structure of printed documents 
independently of their appearance. While there is no agreement that SGML is the optimal 
starting point for a hypertext standard, there is little dispute that SGML's system-independent 
markup makes it significantly easier to exchange and process electronic documents and hence, to 
combine them into hypertext documents. 

A second major standards initiative that is emerging as a driving force for hypertext is 
CALS, the U.S. Department of Defense program for Computer-Aided Acquisition and Logistic 
Support [3]. CALS has as its goal the creation of a "paperless environment" with the integration 
of the various "islands of automation" that participate in the system design, development, 
deployment, and maintenance processes. In February 1988 the CALS program adopted SGML 
as a military standard (MIL-M-28001) for the digital form of traditional printed documents, but 
new standards for creating, exchanging, and delivering information are evolving that completely 
do away with any notion of "printed page." Since so many companies do business either directly 
or indirectly with the Department of Defense, the scope of CALS will be enormous. The 
obvious benefits of digital information exchange throughout the entire government are causing 
CALS concepts and requirements to spill over into other parts of government. 

Market pressure. Programs that called attention to their hypertext features had started to 
emerge in the mid-1980s, but since the release and aggressive marketing of HyperCard by Apple 
Computer in 1987, dozens of other software products that claim to provide hypertext and 
hypermedia capabilities have entered the marketplace. 

Academic interest. Finally, substantial academic interest in hypertext issues has 
emerged in the last few years. In late 1987, approximately the same time as the introduction of 
HyperCard, a conference was held at the University of North Carolina that was the first 
academic rally of researchers and system designers under the hypertext flag [1]. Since then, 
similar conferences have been held in Europe [9] and a second major conference on hypertext 
was held in Pittsburgh in November 1989 [2]. At least one new professional journal has been 
established with "hyper" in its name [6]. 






THE FUTURE 

The 1990s will see ubiquitous software and hardware support for hypermedia 
applications in "off the shelf" computing environments. Computer hardware, software, and 
telecommunications companies will develop business strategies and product lines for multimedia 
systems, applications, and services. 

It is already readily apparent that no single hypertext design or hypertext software is 
appropriate for all applications or users. However, guidelines or standards for choosing design 
approaches or software tools are hard to apply without a framework for understanding the range 
of possible applications into which hypertext solutions might fit. 



NEW VIEWS OF THE HYPERTEXT "DESIGN SPACE" 

Nevertheless, the classification scheme for hypertext applications that this paper began 
with is too arbitrary to serve this important purpose. That scheme loosely categorizes hypertext 
applications according to the kind of information they contain, but has no rationale for defining 
the categories. Why aren't encyclopedias and dictionaries in their own categories? Shouldn't 
training and education be together? Clearly, a more abstract and robust scheme is needed for 
comparing, understanding, and generating hypertext applications. The working group discussed 
several alternative views of the "hypertext design space." 

Dimensional view 

An alternative that I have been developing is based on four non-orthogonal dimensions: 

User dimension: single users vs. groups vs. multiple unrelated users. Hypertext 
systems can be designed for single users, groups of users working collaboratively, or large 
communities of unrelated users. 

Information dimension: creation vs. conversion. Hypertexts can primarily contain new 
information created for the application or information obtained by converting information that 
already exists in conventional printed form. 

Task dimension: task-specific vs. general. Hypertext systems can be designed to 
support specific tasks or as general-purpose environments for building other hypertexts. 

Interface dimension: static vs. dynamic. Hypertexts can be primarily static archives for 
read-only browsing, can be relatively transient databases of periodically-published information 
like news articles or product catalogs, or dynamic to support continuous collaborative authoring 
and commentary. 



To edit, or not to edit? 

An alternative framework for understanding the hypertext design space was proposed by 
Carol Adams. Her view is that all hypertext applications can be partitioned according to whether 
or not they allow users to edit the content of the basic hypertext units and the links between 
them. These two orthogonal dimensions yield four cells into which existing and potential 
hypertext applications might be categorized. 

The two clearest categories in this framework are applications in which both units and 
links can be edited, and "read-only" or pure "browsing" applications in which neither can. 
Applications of hypertext to software design or concurrent engineering domains might embody a 
fixed structure between unit templates and thus primarily support unit-only editing. Finally, 
applications that involve primarily link-only editing with permanent units might include archives 
or literary criticism. 
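
The four cells of this framework can be enumerated directly. The Python sketch below is illustrative only: the names `DesignCell` and `classify` are hypothetical and were not part of the working group's discussion.

```python
from enum import Enum


class DesignCell(Enum):
    """The four cells produced by the two edit/no-edit dimensions (names are hypothetical)."""
    UNITS_AND_LINKS = "both units and links can be edited"
    UNITS_ONLY = "units can be edited, link structure is fixed"
    LINKS_ONLY = "links can be edited, units are permanent"
    READ_ONLY = "neither can be edited (pure browsing)"


_CELLS = {
    (True, True): DesignCell.UNITS_AND_LINKS,
    (True, False): DesignCell.UNITS_ONLY,
    (False, True): DesignCell.LINKS_ONLY,
    (False, False): DesignCell.READ_ONLY,
}


def classify(units_editable, links_editable):
    """Place an application in one of the four cells."""
    return _CELLS[(bool(units_editable), bool(links_editable))]
```

Under this sketch, an archive that lets critics add links over permanent units would be `classify(False, True)`.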



SPECIFICATION OF HYPERTEXT FUNCTIONS 

Standards for the appearance of hypertext user interfaces may not even be possible and 
are certainly premature. The range of applications that call themselves hypertext and the wide 
assortment of user interfaces they contain clearly argue that at best, subsets of standards or 
standards "families" would be appropriate. However, the working group concluded that users 
and application developers would benefit immediately from shared definitions and specifications 
for hypertext functions. "Functions" are defined here as operations carried out by a hypertext 
user interface on the entities managed by the hypertext storage layer [5]. 

The goals of specifications for hypertext functions are straightforward. They must: 

a) fit clearly into the hypertext reference model, 

b) be independent of presentation specifications, and 

c) unambiguously define the operational semantics. 

If these goals can be satisfied, perhaps standards for hypertext functions can emerge that 
can be organized into consistent subsets for different parts of the hypertext design space. Then, 
the interoperability of hypertext systems in the same region of the design space can be defined in 
terms of these functions. The working group began this ambitious effort by creating a list of 
functions and crudely separating them into "authoring" and "reader" subsets. No claim is made 
that these lists are complete. 

Authoring Functions 

1) Create (unit, link, composite) 

2) Edit (unit, link, composite) 

3) Delete (unit, link, composite) 

4) Publish (unit, link, composite, hypertext). "Publish" means to give a hypertext 
component a degree of permanence in some current version or configuration of 
the storage layer. 

Reader Functions 

1) Indicate current unit 

2) Move to another unit 

a) defined spatially (e.g., arbitrary new location in display) 

b) defined syntactically (e.g., in order -- "next," "back") 

c) defined lexically (e.g., unit name contains string "x") 

d) defined semantically (e.g., unit of type "x") 

e) defined temporally (e.g., previous current unit) 

3) Indicate presence of "expandable" structure 

4) Indicate whether currently expanded 

5) Expand current unit 

6) Close current unit 

Annotation Functions 

7) Create annotation 

8) Edit annotation 

9) Delete annotation 

Bookmark Functions 

10) Create bookmark 

a) implicitly when in unit 

b) explicitly by user action 

11) Delete bookmark 

12) Move to "book-marked" unit 

Functions on Virtual Structures 

13) Search (scope, specification) 

14) Define session (history, bookmarks, annotations) 



15) Save session 

16) Restore session 



Miscellaneous Functions 



17) Print (unit, link, linearization) 



Specifying Functional Semantics 



These lists of functions must be accompanied by precise definitions 
of what they mean and the rules by which they can be combined. There are many notations for 
specifying the semantics of functions (e.g., [4]), but I will use an informal approach here that is 
suited to the rudimentary level of the working group's progress in developing the 
specifications. 

For example, BACK (NEXT (X)) = X defines the meaning of "NEXT" and "BACK" 
functions in a hypertext system as follows: if a reader navigates from a unit X using a "NEXT" 
function, the "BACK" function returns to the starting unit X. 

Similarly, DELETE (CREATE (X)) = CREATE (DELETE (X)). 

But, DELETE (PUBLISH (CREATE (X))) is not equal to DELETE (CREATE (X)) 
because the intervening "PUBLISH" function defines a different version or configuration of the 
hypertext. 
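
The BACK/NEXT identity and the PUBLISH inequality can be checked mechanically against a toy model. The Python sketch below is an illustration under stated assumptions, not the working group's notation: `Reader`, `Store`, and the helper functions are hypothetical names, BACK is modeled as a return to the previously current unit, and PUBLISH is simplified to snapshot the whole store rather than a single component.

```python
from dataclasses import dataclass, field


@dataclass
class Reader:
    """Toy reader over a linear sequence of units (hypothetical model)."""
    units: list
    current: int = 0
    history: list = field(default_factory=list)

    def next(self):
        self.history.append(self.current)      # remember where we came from
        self.current = min(self.current + 1, len(self.units) - 1)
        return self

    def back(self):
        if self.history:                       # return to the previous current unit
            self.current = self.history.pop()
        return self

    def unit(self):
        return self.units[self.current]


@dataclass(frozen=True)
class Store:
    """Toy storage layer: the live units plus every published snapshot."""
    units: frozenset = frozenset()
    versions: tuple = ()


def create(store, x):
    return Store(store.units | {x}, store.versions)


def delete(store, x):
    return Store(store.units - {x}, store.versions)


def publish(store):
    # Simplification: publishing records the current set of units as a new version.
    return Store(store.units, store.versions + (store.units,))
```

In this model `Reader(["X", "Y"]).next().back().unit()` yields the starting unit, and a store that has passed through `publish` differs from one that has not, because the published snapshot survives the later delete.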



RESEARCH AGENDA 

The working group concluded that research is needed in many cases to define the 
appropriate semantics for hypertext functions, and it would be appropriate for NIST to conduct, 
sponsor, or encourage this research. Research is also needed to define new measures for 
hypertext that describe characteristics relevant to user performance. This research agenda should 
include research into these areas: 



Evaluating "hypertextability." While there are informal guidelines for determining 
whether a particular document or document collection is suitable for conversion to hypertext, 
more reliable and objective measures are needed. "Hypertextability" can potentially be 
characterized by aspects of the logical structure of a document, such as the number, size, and 
relationships of the information units. 



Validation of hypertext conversion. Measures of hypertextability will also be 
invaluable in hypertext projects for estimating the resources required and estimating schedules. 
Corresponding methods and tools for measuring the "amount of hypertext" that has been 
successfully converted should follow; perhaps hypertext sets of links can be evaluated using 
analogues to the familiar ideas of "precision" and "recall" in information retrieval. 
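
As a minimal sketch of the precision and recall analogy, assume a converted hypertext's links can be compared against a reference set of links judged correct by a human editor. Both that reference set and the function name `link_precision_recall` are assumptions of this example; the working group did not define such a procedure.

```python
def link_precision_recall(produced, reference):
    """Compare produced links with a reference set; each link is a (source, target) pair."""
    produced, reference = set(produced), set(reference)
    correct = produced & reference
    # Precision: share of produced links that are in the reference set.
    precision = len(correct) / len(produced) if produced else 0.0
    # Recall: share of reference links that the conversion actually produced.
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall
```

A conversion that creates few but accurate links would score high on precision and low on recall, which is the same trade-off seen in document retrieval.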

Measuring hypertext "readability." Readability formulas for ordinary text based on 
sentence length, word length, or other characteristics have been a continuing subject of research 
[8]. Hypertext extensions to readability metrics might include measures of the "goodness" of 
links based on similarity between linked units. Readability measures for alternative hypertext 
designs for the same text will go far toward making hypertext design an engineering discipline. 

A final research area identified by the working group where progress will immediately 
benefit users involves intellectual property issues for hypertext and hypermedia. The rash of 
"look and feel" copyright infringement lawsuits and similar claims for software patents confront 
software designers and developers with chaos, uncertainty, and legal action [10]. But as unclear 
as the situation is for software in general, the novel character of hypertext and hypermedia 
software raises still more complexities for intellectual property law. For example, if copyright 
law has different rules for "literary works," "audiovisual works," "sound recordings," and 
"pictorial works," into what legal category does an interactive hypermedia encyclopedia or a 
talking book fall? Are new links or notes in a hypertext system considered "derivative works" 
under copyright law? These and other issues are not just legal curiosities -- they will have 
considerable impact on the legal protection available and hence the economic viability of 
hypermedia systems. 

PAPERS 



This section of the proceedings contains the twelve contributed papers which were 
accepted for publication and featured during the plenary session on the opening day of the 
workshop. It also contains the two papers which the interchange group recommended be 
added. 

Hypertext Interchange Format 
- Discussion and Format Specification 
DRAFT 1.3.4 
jeremy bornstein 
victor riley 



The Hypertext interchange format described here is based on the work 
of the Dexter group, an industry coalition of hypertext researchers interested 
in a standard for hypertext data exchange. This paper describes the result of a 
collaboration towards this end between Jeremy Bornstein and Frank Halasz, 
with significant input from other members of the Dexter group, most notably 
Tim Oren. The work took place during the summer of 1989, and a 
demonstration is planned for the Hypertext '89 conference in November of 
1989. 



background and rationale 

The number of hypertext platforms is increasing, not decreasing. 
Although this development will most likely settle down to a stable state, it is 
almost certain that no one platform will dominate the hypertext world to the 
extent that nobody at all will use an incompatible platform. Nevertheless, 
large bodies of hypertext data are being developed in systems which will 
either die or evolve. An interchange format allows users on separate systems 
to share their data, thus eliminating the need to acquire, learn, and use a new 
hypertext system only to access that system's data. 

Of course, in order to propose a reasonable interchange format, the 
structure of the data must first be determined. As it happens, with regard to 
hypertext this is by no means a closed issue. The Dexter group made the 
decision to describe a format which would be able to include everyone's 
definition of hypertext and thereby short-circuit "rathole" debates about the 
nature of hypertext, instead focusing effort on the structure of a given 
system's hypertext. The framework, described below, attempts to be an 
inclusive definition rather than an exclusive one. 

generalities 

The format is an ASCII format, as opposed to a binary format. 
Conversion to a binary format is possible if desired, but a text format is much 
easier when the definition of the format is still evolving. 

The appearance of the format is similar to that of SGML¹: there are tags 
marking the beginning of a hierarchical section and tags marking the end 
("begin-tags" and "end-tags"); the end-tag corresponding to a given begin-tag 
has a backslash ("\") in front of the name for the begin-tag. Tags appear 
between less-than and greater-than signs ("<" and ">"); if the less-than 
sign appears in the data, it is doubled ("<<"). The order of the children of a 
given tag is irrelevant². 
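
The tag syntax just described can be exercised with a small parser. The following Python sketch is not part of the draft: the names `TOKEN`, `parse`, and `canonical` are hypothetical, a literal less-than sign in data is assumed to be written doubled ("<<"), tag names are assumed to accept any digit, and tag names are folded to lower case because case is not significant.

```python
import re

# One token is an escaped "<<", a begin-tag <name>, an end-tag <\name>, or a run of data.
TOKEN = re.compile(r"<<|<(\\?)([A-Za-z0-9_]+)>|[^<]+")


def parse(text):
    """Parse interchange text into a list of top-level (name, children, data) nodes."""
    root = ("", [], [])
    stack = [root]
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"malformed tag at offset {pos}")
        pos = match.end()
        if match.group(0) == "<<":
            stack[-1][2].append("<")             # doubled sign stands for one literal "<"
        elif match.group(2) is None:
            stack[-1][2].append(match.group(0))  # plain data
        elif match.group(1):                     # end-tag: backslash before the name
            name = match.group(2).lower()
            if len(stack) == 1 or stack[-1][0] != name:
                raise ValueError(f"unexpected end-tag for {name!r}")
            stack.pop()
        else:                                    # begin-tag
            node = (match.group(2).lower(), [], [])
            stack[-1][1].append(node)
            stack.append(node)
    if len(stack) != 1:
        raise ValueError(f"missing end-tag for {stack[-1][0]!r}")
    return root[1]


def canonical(node):
    """Return an order-independent, whitespace-trimmed form of a parsed node."""
    name, children, data = node
    if children:
        return (name, tuple(sorted((canonical(c) for c in children), key=repr)))
    return (name, "".join(data).strip())
```

Comparing the `canonical` forms of two parsed documents treats them as equal when they differ only in the order of sibling tags or in insignificant whitespace.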

Tags which are not understood by a parser are guaranteed to be ignored 
by that parser. In other words, if a particular system exports information 
which no other system understands (yet), then this will not cause another 
parser to crash, but merely render an incomplete version of the document. 

The characters A-Z, a-z, 1-9, and the underscore ("_") are the only valid 
characters which may be used in the name of a tag. Case is not significant. So 
far the agreed-upon conventions are that tags begin with a lower case letter 
and that words after the first are marked by capitalization of the initial letter. 
For example, "thisHasFourWords" is a tag name which adheres to these 
conventions. 

Whitespace, when it appears outside of the data belonging to a bottom-level 
tag, is not significant. Often in examples, a single space character is 
added after bottom level start-tags and before the corresponding end-tags, but 
this whitespace is not in the actual export files. The indentation which 
appears in examples is also not part of the format, but it should not cause an 
interchange-format parser to fail. 

Since many references in a hypertext environment will take place 
across "document" boundaries, it is necessary to be able to reference many 
objects from a global standpoint. In order to make this independent of file 
name and directory position, global IDs are used. So far, the numbers are 64 
bit numbers which may be chosen by any method, preferably including at 
least some random bits. Eventually this may be changed in favor of some 
method which better ensures uniqueness of each identifier. 
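
A minimal sketch of such an identifier generator, assuming all 64 bits are drawn at random (the draft only asks for "at least some" random bits). The function names and the 16-digit hexadecimal rendering are assumptions of this example, not part of the format.

```python
import secrets


def new_global_id():
    """Return a 64-bit identifier made entirely of random bits."""
    return secrets.randbits(64)


def format_id(value):
    """Render an identifier as 16 hexadecimal digits (a hypothetical textual form)."""
    return f"{value:016x}"
```

Random identifiers of this size make collisions unlikely but do not rule them out, which is consistent with the draft's remark that a method giving stronger uniqueness guarantees may replace this one.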



specifics 



¹SGML: Standard Generalized Markup Language 
²That is, the following two expressions are equivalent: 

• <foo> <bar> 128 <\bar> 

<baz> 256 <\baz> <\foo> 

• <foo> <baz> 256 <\baz> 

<bar> 128 <\bar> <\foo> 

This section is a rather humorless and redundant description of the 
data format. It might be more efficient to read the sample file first and then 
refer below for confirmation and clarification of your understanding. The 
description which follows is hierarchical, as is the interchange format itself. 

<DOCUMENT> 

The outermost tag in a HIP-format document is the <DOCUMENT> 
tag. The <DOCUMENT> tag has four possible types of children: the 
<HEADER> tag, <NODE> tags, <LINK> tags, and <COMPOSITE> tags. 
<HEADER> 

The <HEADER> tag contains relevant information about the 
document as a document: the name, the unique id, which 
system it was exported from and on what date. 
<NAME> 

This is the name of the document in the originating 
system. The name is primarily for display to the user, but 
it is possible that it could be used in trying to resolve links 
as well. 

<ID> 

This is the unique id of the document, following the rules 
for ids given above. 
<EXPORTED> 

This tag contains information about the originating 
system and when the document was exported from that 
system. 
<FROM> 

This is the name of the originating system. 
<DATE> 

This is the date on which the document was 
exported. A standard format for the date has not 
been agreed upon. 
<ACCESS> 

These are the access rights for the document set. In the 
case of Intermedia this is the web, for NoteCards this is the 
NoteFile, for HyperCard this is the stack. No format has 
been agreed upon. 
<CREATION> 

The <CREATION> tag tells the time of creation and the 

creator for the document. 

<BY> 

This is the creation author. 
<DATE> 

This is the date on which the document was created. 
<MODIFIED> 

The <MODIFIED> tag tells the time of modification and 
the modifier for the document. A set of these can tell 
history for changes. 
<BY> 

This is the modifier author. 
<DATE> 

This is the date on which the document was last 
modified. 

<NODE> 

The <NODE> tags in a document function as the wrappers for 
the text/graphics/&c. A <NODE> has several parts: 
<USE> 

This tag is used to specify the location of the contents of 
the NODE. If two <DOCUMENT>s share the same 
<NODE>, the <USE> tag is used to specify the location of 
the shared data. 
<NAME> 

This is the name of the node in the originating system. 
The name is primarily for display to the user. 

<ID> 

This is the unique id of the <NODE> (see above). 
<ACCESS> 

These are the access rights for the node. 
<CREATION> 

The <CREATION> tag tells the time of creation and the 

creator for the node. 

<BY> 

This is the creation author. 
<DATE> 

This is the date on which the node was created. 
<MODIFIED> 

The <MODIFIED> tag contains information about who 
made the last modification to the NODE, and when the 
modification was made. A set of these can tell history for 
changes. 
<BY> 

This is the userid (or other identifying information) 
of the last person to modify the NODE. 
<DATE> 

This is the date on which the node was last modified. 

<DATA> 

The <DATA> tag contains the <NODE>'s low-level data 
(text or a picture, for example). If the <USE> tag is used, 
this should be NULL. 
<runTimeStuff> 

The <runTimeStuff> tag contains information about how 
the <DATA> should be displayed; it is currently the tag 
undergoing the most revision. It is expected that much of 
the information within it, such as font name, will often be 
unusable in the imported-to system. Within the 
<runTimeStuff> tag, the five tags below are the only ones 
currently defined. The last three will most likely be 
uninterpreted by any system besides HyperCard. 
<FRAME> 

The position of a NODE with respect to its parent³ is 
described by the <FRAME> tag. If the <FRAME> 
tag is absent, then the parent <NODE> is considered 
to be "immediately subsequent" to the previous 
<NODE>. This would be the case for multiple 
<NODE>s in a creamy hypertext system such as 
NoteCards or Intermedia. Otherwise, the following 
two tags determine the frame: 
<SIZE> 

This is the size (x,y) of the node. 
<LOCATION> 

This represents the offset (x,y) between the 
parent's origin and the node's origin. If not 
present, it is undefined and the importing 
system is free to set it arbitrarily. 

³The parent may be a <COMPOSITE> node or null. 

<fontSpec> 

The <fontSpec> contains information about the 

font of the data. 

<NAME> 

This tag contains the name of the font. 
<SIZE> 

This tag contains the point size of the font. 
<STYLE> 

This tag contains any style modifications to 
the font: i.e., bold, italic, underline, &c. 
<JUSTIFY> 

This tag contains the justification rule for the 
text: left, center, or right. 

<lockText> 

This tag is "true" if the user is allowed to modify 
the text of the item, and "false" otherwise. 
<STYLE> 

This tag, probably only interpreted by HyperCard, 
describes the frame for the <NODE>'s <DATA>. 
<originalType> 

This tag, also probably only interpreted by 
HyperCard, contains "button" or "field," depending 
on the original type of the object. 
<ANCHOR> 

There may be several <ANCHOR> tags within a given 
<NODE>. The anchor tags contain information about all 
anchors present within the <NODE>'s <DATA>. 
<NAME> 

This is the name of the anchor in the originating 
system. The name is primarily for display to the 
user. 

<ID> 

This is the unique id of the anchor and must be 
present. 




<CREATION> 

The <CREATION> tag tells the time of creation and 

the creator for the anchor. 

<BY> 

This is the creation author. 
<DATE> 

This is the date on which the anchor was 
created. 
<MODIFIED> 

The <MODIFIED> tag contains information about 
who made the last modification to the ANCHOR, 
and when the modification was made. A set of 
these can tell history for changes. 
<BY> 

This is the userid (or other identifying 
information) of the last person to modify the 
ANCHOR. 
<DATE> 

This is the date on which the anchor was last 
modified. 
<LOCATION> 

This is the offset in bytes (0 is the position before 
the first character) of the anchor text. If the 
<LOCATION> is a pair of numbers separated by a 
comma (or a triple for 3-D space), this describes the 
text span already in the <DATA>. If the 
<LOCATION> is absent, the whole <DATA> is the 
relevant text. 
<TEXT> 

This is the text which the anchor is attached to. If 
the <LOCATION> tag is a single number (i.e., no 
comma) then the text is inserted at that position. 
Otherwise, the text need not be specified. 
<runTimeStuff> 

The <runTimeStuff> tag contains information 
about how the <ANCHOR> should be displayed; it 
is currently undergoing revision. 
<VIEW> 

The <VIEW> tag contains information about 
how the <ANCHOR> could be viewed. This 
also specifies whether the <ANCHOR> is a 
2D or 3D view or either. Right now, this is 
application specific. 
<OBJECT> 

The <OBJECT> tag specifies the objects the 
<ANCHOR> is attached to. This covers 
multiple spans of text, or multiple graphical 
objects. Right now this is application specific. 
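
The interplay of an anchor's <LOCATION> and <TEXT> described above can be sketched in Python. This is an interpretation, not part of the draft: the function name `resolve_anchor` is hypothetical, a pair is assumed to mean (start, end) offsets, offsets are treated as character positions (equivalent to bytes for ASCII data), and 3-D triples are left unhandled.

```python
from typing import Optional, Tuple


def resolve_anchor(data: str, location: Optional[str], text: str = "") -> Tuple[str, int, int]:
    """Return (data, start, end) so that data[start:end] is the anchored text."""
    if location is None:
        # No <LOCATION>: the whole <DATA> is the relevant text.
        return data, 0, len(data)
    parts = [int(p) for p in location.split(",")]
    if len(parts) == 1:
        # Single offset: the anchor's <TEXT> is inserted at that position.
        start = parts[0]
        return data[:start] + text + data[start:], start, start + len(text)
    if len(parts) == 2:
        # Pair: the span is already present in the <DATA> (assumed start, end).
        start, end = parts
        return data, start, end
    raise ValueError("3-D locations are not handled in this sketch")
```

For example, a location of "6,11" on the data "hello world" selects "world" without changing the data, while a location of "5" with text " big" produces "hello big world".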

<LINK> 




A <LINK> holds all the information about a single bidirectional 
link. This may be expanded in the future to describe multi- 
headed and multi-tailed links. 
<NAME> 

This is the name of the link in the originating system. 
The name is primarily for display to the user. 

<ID> 

This is the unique ID of the link itself. 
<sourceNodeId> 

This is the ID of the node associated with the start of the 
link. 

<sourceAnchorId> 

This is the ID of the anchor (within the source NODE) 
from which the link originates. If unspecified, the link is 
from the whole NODE. 
<destinationNodeId> 

This is the ID of the node associated with the end of the 
link. 

<destinationAnchorId> 

This is the ID of the anchor (within the destination 
NODE) to which the link is bound. If unspecified, the link 
destination is the whole NODE. 
<CREATION> 

The <CREATION> tag tells the time of creation and the 

creator for the link. 

<BY> 

This is the creation author. 
<DATE> 

This is the date on which the link was created. 
<MODIFIED> 

The <MODIFIED> tag contains information about who 
made the last modification to the LINK, and when the 
modification was made. A set of these can tell history for 
changes. 
<BY> 

This is the userid (or other identifying information) 
of the last person to modify the LINK. 
<DATE> 

This is the date on which the link was last modified. 

<TYPE> 

This is a string which describes the type of link; some 
examples: "Explanation," "Next," "Annotation." 
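
A <LINK> as described above can be held in a small record and followed in either direction. The Python sketch below is illustrative: the class `Link`, its field names, and the function `other_end` are hypothetical, and an unspecified anchor id is represented as `None`, meaning the whole NODE.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Link:
    """In-memory form of one bidirectional <LINK> (field names are hypothetical)."""
    id: int
    source_node_id: int
    destination_node_id: int
    source_anchor_id: Optional[int] = None       # None: link is from the whole NODE
    destination_anchor_id: Optional[int] = None  # None: link is to the whole NODE
    type: str = ""


def other_end(link: Link, node_id: int) -> Tuple[int, Optional[int]]:
    """Follow the link from node_id and return (node id, anchor id or None) of the far end."""
    if node_id == link.source_node_id:
        return link.destination_node_id, link.destination_anchor_id
    if node_id == link.destination_node_id:
        return link.source_node_id, link.source_anchor_id
    raise KeyError(f"node {node_id} is not an endpoint of link {link.id}")
```

Because the link is bidirectional, the same record serves traversal from the source and from the destination; an importing system would still need to check that both node ids exist in the document.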
<COMPOSITE> 

A <COMPOSITE> tag is the framework within which frame- 
based systems such as HyperCard and KMS represent 
cards/frames. It contains an <ID>, one or more <NODE>s, and a 
<runTimeStuff>. 
<ID> 

This is the <COMPOSITE>'s unique ID. 
<runTimeStuff> 

So far, the only <runTimeStuff> defined for a 
<COMPOSITE> is the <FRAME>. 
<FRAME> 

The <FRAME> represents the <COMPOSITE>'s 
size and relation to its parent. 

<SIZE> 

This is the size (x,y) of the composite. 
<LOCATION> 

This represents the offset (x,y) between the 
parent's origin and the composite's origin. If 
not present, it is undefined and the 
importing system is free to set it arbitrarily. 

<NODE> 

This is the meat of the composite. See above for a 
description of this data structure. 

In real-world applications, it is rare that a hypertext system provides a complete solution. Instead 
the solution normally comes from a combination of a hypertext system with other tools. Thus, as 
Meyrowitz (1987) has argued in his powerful position paper "The missing link: why we're all 
doing hypertext wrong", one of the most desirable attributes of a hypertext system is that it 
should fit easily into its environment, and allow a close interaction with other tools in that 
environment. 

There is now a movement towards standardisation in hypertext systems, in particular a proposal 
that source files for hypertext systems should follow a standard form so that material can be 
interchanged between different systems. The market forces pushing this standardisation effort are 
obvious, but we must ensure that new standards do not detract from the interaction between 
hypertext systems and other tools. At an extreme, a standard that made it easy for a hypertext 
system to exchange files with other hypertext systems but hard to exchange with anything else 
would be a disaster. 

Do we use text-files? 

Choosing a file format for hypertext systems is similar to choosing a file format for 
word-processing systems. Indeed many hypertext systems support a good repertoire of word-processing 
operations. Hypertext systems have the added needs of representing hypertext constructs and 
links. Hopefully any standard will encompass all documents, irrespective of whether they are 
created from word-processing or hypertext. For hypermedia systems, similar considerations apply 
to the other media, but this paper concentrates mainly on text. 

A basic choice is whether files should be text-files. By a text-file we mean a linear sequence of
text with embedded mark-up but with no embellishments such as file-headers, associated tables,
embedded pointers, etc.

This paper argues the advantages of text-files. The argument is based on experience with the 
UNIX implementation of Guide, which uses a text-file format. Most of the material is concerned
with nitty-gritty practical experience rather than with any underlying theory, but standards cannot
ignore these practical aspects. We shall start by emphasising the properties of UNIX Guide that
influence its file format.

UNIX Guide 

A central aim of the UNIX implementation of the Guide hypertext system is that it should fit well 
into a UNIX environment (Brown, 1989). Indeed it is this facet, more than anything else, that has
caused UNIX Guide to be different from the implementation of Guide marketed by Office
Workstations Ltd (OWL) which runs on Macintoshes and PCs. OWL Guide successfully fits into
its environment, which is very different from UNIX and has a strong house-style that pervades 
most of the software that runs in that environment. 

UNIX Guide (and henceforth all references to Guide should be taken as UNIX Guide) tries to
follow the original UNIX 'Small is beautiful' philosophy, though this philosophy has perhaps
been weakened over the years to the less catchy 'Medium-sized is beautiful'. Guide cannot hope
to provide all the facilities that users may want. Instead it should be good at one thing, hypertext, 
and use other tools to provide functions that they are good at. 

Characteristic features 

Every hypertext system has some characteristic features that set it apart from the herd. In the case 
of Guide there are three such features: UNIX orientation, which we have just discussed, late 
binding and the scroll model. 
Guide's late binding philosophy is that fixing of hypertext links should be delayed to the last
possible moment; this is normally at run-time when the link is selected for the first time. Late
binding has a number of benefits, arising from the dynamic nature of links.

The Guide author specifies a link by a symbolic name (e.g. 'Lesser-spotted woodpecker'). If the
link goes outside the current file a filename is appended to the symbolic name (e.g. '...in /x/y/z').
The destination of a link is a Guide 'definition' with the same symbolic name as the link. When
links are saved in Guide source files they follow this symbolic form: they are just a sequence of
characters attached to the button-name that is the source of the link, and only at run-time do they
cause a link to be forged (by searching for a definition that matches the given name). Late binding
is therefore a force that makes source files simpler and flatter.
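
The late-binding idea can be illustrated with a minimal sketch in Python. It is not Guide's implementation: it assumes a hypothetical in-memory table of definitions keyed by symbolic name, and the names resolve_link, definitions and cache are invented for the illustration. The link is stored only as a string, and the search for a matching definition happens when the link is first selected.

```python
# Sketch of late binding (hypothetical names, not Guide's own code).
# A link is stored only as a symbolic name; the binding to a definition
# is made the first time the link is selected and remembered afterwards.

def resolve_link(name, definitions, cache):
    """Return the definition bound to a symbolic link name.

    `definitions` maps symbolic names to definition text.
    `cache` holds bindings that have already been made.
    """
    if name not in cache:
        if name not in definitions:
            raise KeyError(f"no definition matches link name {name!r}")
        cache[name] = definitions[name]   # the link is forged here
    return cache[name]


# Placeholder data for the illustration.
definitions = {"Lesser-spotted woodpecker": "(definition text)"}
cache = {}
replacement = resolve_link("Lesser-spotted woodpecker", definitions, cache)
```

Because the source file holds only the name, nothing in it has to change when the definition moves or is rewritten before the link is first used.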

The third characteristic feature of Guide is its scroll model. A Guide document is a continuous 
scroll, and when buttons are selected they are replaced in-line by the corresponding button- 
replacement, thus causing the scroll to grow and shrink as buttons are selected/deselected. 
Groups of buttons can be combined into larger units, called enquiries. In Conklin's (1987)
terminology an enquiry is a region, which is replaced if any button within the region is selected.
In page-based systems that have a single current page, e.g. HyperCard, the region to be replaced
is always the whole current page. Enquiries offer more flexibility: in particular, at one extreme
they can be made to encompass the entire current document. If this is done, Guide,
notwithstanding its underlying scroll model, can be used to simulate these page-based hypertext
systems. (See Brown (1990) for a discussion of a large application that takes advantage of this.)
At another extreme the region of replacement can be made null: everything remains; if, in
addition, a button is made to throw its replacement up in a new window (as Guide 'action-buttons'
can be made to do) instead of in place of the original button, then the end result has the flavour of
NoteCards. Overall, therefore, the scroll model is not fundamentally different from a page-based 
one. 

Nevertheless the scroll model, with in-line replacement the norm, has influenced the source file 
design. For the simplest type of button, which has a fixed replacement that is associated with that 
button and no other, the button-replacement comes immediately after the button-name in the 
Guide source file. This simplest type of button is also generally the commonest, since it is used in
hierarchical expansions. 

Guide source files 

Having covered Guide's characteristics we can now describe its source file format, and the 
advantages that come from using such a format. 

As we have said, the file format is that of a text-file: a sequence of text and graphics with
embedded mark-up. The mark-up simply shows where Guide constructions (e.g. buttons,
replacements, enquiries, and 'ghosts', which are Guide comments) begin and end. All the necessary
information is carried by this mark-up: there is no file-header and there are no associated tables, 
etc. 

The mark-up follows the format of troff requests. For example, a button-name 'Lesser-spotted
woodpecker' would be represented as

.Bu button-attributes
Lesser-spotted woodpecker
.bU

Thus the Bu and bU requests mark the beginning and end of a button name, and the Bu request has 
as its argument a description of the button's attributes. (For better or for worse, attributes do not 
figure strongly in Guide and the Bu request is, in fact, one of the few Guide requests that has 
attributes.) 
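
To show how little machinery such a format needs, here is a minimal sketch in Python of a reader for the button-name mark-up shown above. It handles only the Bu and bU requests described in the text; the function name parse_guide and the tuple layout of its result are assumptions made for this illustration, and other Guide requests are not modelled.

```python
# Sketch of a reader for the .Bu/.bU mark-up (illustrative only).
# Lines outside a button are returned as plain text; lines between
# .Bu and .bU are collected as the button-name.

def parse_guide(source):
    """Split source into ('text', line) and ('button', attributes, name) items."""
    items = []
    button_attrs = None   # attributes of the button being read, if any
    button_lines = []
    for line in source.splitlines():
        if line.startswith(".Bu"):
            button_attrs = line[3:].strip()
            button_lines = []
        elif line.startswith(".bU"):
            if button_attrs is None:
                raise ValueError(".bU without a matching .Bu")
            items.append(("button", button_attrs, "\n".join(button_lines)))
            button_attrs = None
        elif button_attrs is not None:
            button_lines.append(line)
        else:
            items.append(("text", line))
    if button_attrs is not None:
        raise ValueError(".Bu without a matching .bU")
    return items
```

A file containing no requests at all passes through this reader as a list of plain text lines, which is the property of text-files that the paper relies on.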
The purpose of this paper is not, of course, to propose troff format as a standard. As far as Guide
itself is concerned it would be equally easy to replace the troff syntax with any other syntax that
had mark-up embedded in the text, e.g. our previous example could have been in the SGML (ISO,
1986) form:

<Button ...> Lesser-spotted woodpecker </Button>

However, given the need to use other UNIX tools, the use of troff syntax, which is a UNIX
standard, has certain advantages. For example:

• spell, the UNIX spelling checker, can be used on Guide files without any adjustment.
(It automatically strips off troff mark-up by using the deroff utility.)

• if Guide files are to be formatted and printed on paper, troff can do the job. For
example the Bu request can be made a macro which, inter alia, switches to bold-face so
that button-names come out in bold. (The names of Guide requests have been
deliberately chosen not to clash with other troff requests.)

These UNIX-dependent advantages of Guide's mark-up should not, however, be
over-emphasized, and if SGML-based tools had been readily available SGML format would have been
a better choice.

Readability 

The majority of Guide users are unaware of how its source files are stored. However some authors
do need to look at or to generate source files, and for them it is a huge advantage that the files are
fairly readily understood by humans. Indeed the very first Guide implementation (1984-5) had a
file format involving esoteric binary codes, and perhaps the greatest step forward in Guide's
development has been the banishing of this mumbo-jumbo. Sample benefits of the readable form 
are: 

• it can be edited using specialist editors. Although Guide offers editing, this is not its 
forte; elaborate editing, e.g. global replacement of a pattern, can be done by a tool that 
is specially designed for such tasks. 

• it makes conversion programs easier to write and debug, a point we discuss later.

Other media

Although this paper concentrates on text, since we believe it will predominate in most hypertext 
applications for the foreseeable future, it is not sensible to ignore other media. They can be either: 

(a) stored in separate files, whose names are referenced in the main text-file. These
separate files would hopefully be represented in the appropriate standard form for
the media.

or (b) embedded in the form of comments in the text-file. Often the content of these
comments will appear as arbitrary binary codes, sanitized if it is necessary to avoid
'difficult' codes such as end-of-file and end-of-line.

UNIX Guide offers both. If the second approach is used a bit-map picture is represented as:



.Pi
.\" bytes representing binary encoding
.\" bytes representing binary encoding
.pi

Each line of the binary encoding is made to appear as a troff comment. This is important, as it
causes utilities such as spell to ignore these lines; otherwise there could be spurious reports of 
spelling errors. 
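
The embedding of a picture as comment lines can be sketched as follows. The paper does not say how Guide sanitizes the bytes, so this Python sketch assumes a hexadecimal encoding purely for illustration; the function names and the line length of 32 bytes are likewise hypothetical. Only the Pi and pi requests and the comment prefix are taken from the example above.

```python
# Sketch of embedding binary picture data as troff comment lines.
# ASSUMPTION: hexadecimal is used as the sanitized encoding; the real
# Guide encoding is not described in the paper.

COMMENT_PREFIX = '.\\" '   # the four characters: . \ " space

def encode_picture(data, bytes_per_line=32):
    """Wrap raw bytes between .Pi and .pi, one comment line per chunk."""
    lines = [".Pi"]
    for start in range(0, len(data), bytes_per_line):
        chunk = data[start:start + bytes_per_line]
        lines.append(COMMENT_PREFIX + chunk.hex())
    lines.append(".pi")
    return "\n".join(lines)

def decode_picture(text):
    """Recover the raw bytes from the comment lines of an encoded picture."""
    hex_parts = [line[len(COMMENT_PREFIX):]
                 for line in text.splitlines()
                 if line.startswith(COMMENT_PREFIX)]
    return bytes.fromhex("".join(hex_parts))
```

Every line of the encoded picture is either a request or a comment, so a tool that skips troff mark-up never sees the picture data as text.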

In order to create the encoding of a picture, Guide has to capture the raw picture in the first place.
(The raw picture will typically have come from a drawing program or a scanner.) Like most other
software, Guide tries to avoid input modes ('This is a picture', 'This is a text file'). Input modes
can be avoided if files have a type associated with them. UNIX has a somewhat basic (unkind
people would say crude) mechanism for attaching a data type to a file. This is the 'magic
number'. It helps Guide avoid input modes, though it becomes difficult if material comes in
through a pipe rather than direct from a file. Overall a standard could not assume that every file
system provides a satisfactory mechanism for attaching a data type to a file. Hence if source files
are represented in a wide variety of forms, corresponding to different media standards, the user
will sometimes be forced into the use of different input modes.

Aims of standards 

It is worth pausing at this point to consider the purpose of hypertext standards. Three important 
aims of hypertext standards should be:

(1) to allow import/export of documents, or more generally to allow sharing of documents
with other software.

(2) to allow exchange of documents with other hypertext systems.

(3) to allow existing tools to be applied to standard documents.

The last of these is often overlooked, but if there are no tools associated with a standard the
standard will be a standard that no-one uses, a bitter lesson that many have learned. In most
environments (and especially in UNIX) the vast majority of existing tools use a linear textual
format. This may be a sad commentary on the state of the world, but it is the reality. Hence
choice of a text-file format as a standard has big advantages.

One can argue on the relative importance of (1) to (3) above. Personally we rate (1) and (3) equal,
with (2) far behind. We shall now discuss (1) further.

There are two sub-cases of (1). Firstly there is the import/export case where material produced by
another tool is converted to hypertext form or the hypertext form is converted for use by another
tool. The other tool may be a word-processor, a database, a programming language compiler, a
drawing tool, etc. Secondly there is the Utopia which the standard envisages: all material shares
the same format and no conversion is necessary, though several problems remain, as we shall
see later.

Conversion may be done in advance or on-the-fly. The latter is, of course, preferred if conversion
is a fast process, since it does not involve keeping two separate documents up to date. Conversion
is normally a dreary and unsatisfactory process, but there are three ways in which the hypertext
file format can help:

• a simple textual format facilitates conversion.

• it helps if hierarchical buttons have their replacement immediately following. For
example it then requires only a trivial effort, when converting a word-processor file, to
map section headings into button-names and the body of the section into the button's
replacement.

• a format that is readable by humans aids the debugging of conversion utilities. (Sadly,
conversion utilities, because of their ad hoc nature, tend to take a long time to debug.
Each new source document brings a new crop of problems.)
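
The second point can be made concrete with a short sketch of such a conversion, written in Python. It rests on several assumptions that are not in the paper: the input marks a section heading with a leading "# ", and the requests .Re and .rE are hypothetical placeholders for whatever mark-up delimits a button's replacement (the paper names only the Bu and bU requests). The function name is also invented.

```python
# Sketch of converting a sectioned text file into button/replacement form.
# ASSUMPTIONS: headings start with "# "; ".Re" and ".rE" are hypothetical
# stand-ins for the requests that delimit a button's replacement.

def sections_to_guide(text, heading_prefix="# "):
    """Map each heading to a button-name and each section body to its replacement."""
    out = []
    in_section = False
    for line in text.splitlines():
        if line.startswith(heading_prefix):
            if in_section:
                out.append(".rE")            # close the previous replacement
            out.append(".Bu")
            out.append(line[len(heading_prefix):])
            out.append(".bU")
            out.append(".Re")                # replacement follows immediately
            in_section = True
        else:
            out.append(line)
    if in_section:
        out.append(".rE")
    return "\n".join(out)
```

The converter is a single pass with one flag of state because the replacement sits directly after its button-name; a format with separate tables would need the converter to build and emit those tables as well.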

Pipes 

If conversion is performed on-the-fly the UNIX pipe (now available, in one form or another, in
most operating systems) is a convenient way for transferring data. Hence Guide is frequently
used as a component of a pipe.

Following the general UNIX philosophy Guide does not know or care whether its input comes
from a source file or a pipe and the same format applies to both.

In this environment the following characteristics of source files have proved valuable:

• source files are text-files; again this advantage comes first: most piping mechanisms
are based on the stream-of-characters model.

• a text-file containing no mark-up at all is a valid source file. Such material (e.g. the
whole or part of existing non-structured files) is commonly used in building Guide
documents and does not, therefore, require a special input mode.

• a concatenation of source files is a valid source file. Moreover a source file can be
included within another. Thus a utility such as the C pre-processor can be used to
build the Guide input from a combination of existing source files. (These may, indeed,
be parameterised using pre-processor statements such as define and ifdef.)

Newlines 

A small issue of some importance is the treatment of newline characters, and in particular whether
they should be hard or soft. Since newlines are hard in ordinary text files, Guide generally treats
newlines as hard. However a newline that precedes a Guide request is ignored. (A newline
preceded by a null Guide request therefore acts as a soft newline. When Guide saves a file it
inserts a soft newline if an output line is getting too long, since very long lines knock out many
UNIX utilities.) Obviously, when material is imported or exported, soft newlines and other soft
mark-up need to be stripped out before transmission.

Dynamic interchange

Ideally a hypertext system should support a dynamic interaction with its environment. Thus data
should be shared with other programs while the hypertext system is running. It is natural that the
source file format applies to such data as well as to data that is pre-stored in source files. In Guide,
the selection of a button can cause a program to be run, and the output from that program serves as
the replacement of the button. This output follows the normal Guide source format; usually it is a
sequence of ASCII characters without any mark-up. Sometimes, however, the output may involve
hypertext structure: for example in one application, a button launches a program that is a retrieval
system. The program searches for a given term and converts the hit list into a hypertext structure
that makes it easy for the user to examine the hits that seem most relevant. This structure is duly
displayed by the hypertext system. In another application a button runs a program to produce a
report of items currently in stock, and this output is produced in a hierarchical hypertext format.

The issue of standardization also affects the programs that are executed within hypertext systems.
Most systems contain their own programming language, and in HyperCard this is a major part of
the system. However experience suggests it would be hopeless to expect every hypertext system
to abandon its current programming language and adopt a new standard one.



Saving 

The 'save' operation from a hypertext system may involve:

(1) saving what is seen. 

(2) saving what is seen, together with the hypertext structure behind it. 

It is (2) that interests us here, since it creates a hypertext source file. This output file need not
relate directly to a single input file: at one extreme it could have resulted from loading several
input files and editing them; at the other, the material saved could be a small fragment of an
original input file.

Cut-and-paste, when used to cut from the hypertext system, is a special case of saving. Ideally
both (1) and (2) above should be offered, though Guide currently only offers (1). Case (2) is
useful if the material is to be pasted back into a hypertext document.

Saving may go directly to a file or into an output pipe.

Saving presents no problem if source files use a text-file format. If the source format involves
file-headers or the like, it requires more thought and perhaps more user action, particularly if the
original input came from diverse sources.

Sharing files

Earlier in this paper we wandered in the anarchical world of conversion programs; it is now time
to move on to the relatively Utopian idea of sharing information so that an identical file can be
processed by many different systems.

Let us assume that two programs X and Y share the same file. (X and Y may be different
hypertext systems or one or other of them may be, say, a word-processing system.) A user of X
may load the file, edit it and then save it. Clearly the file should still be usable by Y.

This apparently simple requirement requires care. Inevitably there will be some operations Y can
do, but X cannot. Assume for example that Y can display text in different point-sizes but X
cannot. If a file contains mark-up indicating a change of point-size X must preserve this
information when a file containing point-size changes is loaded into X, edited and subsequently
saved. As a greater challenge, X must behave sensibly when editing involves material that
contains point-size changes: what happens if half of a string in a large point-size is copied, and the
instruction to increase the point-size is copied but the corresponding instruction to set it back is
not copied?

Guide currently makes an attempt to deal with these issues. It has an experimental system for
sharing files with troff. If a troff file is loaded into Guide, Guide tries to take account of mark-up
it can handle, e.g. new paragraphs; other mark-up, such as change of point-size, is ignored.
However all the original mark-up is loaded into Guide in the form of 'ghosts', comments
that are only visible to Guide authors, not to Guide readers. When a Guide file is saved, these
ghosts are converted back to the original troff mark-up, thus re-creating the original file. Given
that Guide authors can see these ghosts, they will, hopefully, be aware of the implications of the
mark-up when they perform edits.

On the other side of the sharing, when troff is using the file, there are fewer problems, not least
because troff has no save operation. It is, in this situation, a happy property of troff that it
completely ignores requests it cannot recognise; thus Guide mark-up is ignored.

Overall the current Guide sharing system just about works, but could profitably be replaced by
something built on sounder foundations.
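
The ghost mechanism described above amounts to a round trip in which foreign mark-up is carried along untouched. The Python sketch below is an illustration of that idea and not Guide's implementation: the function names, the item layout and the choice of .PP as the one request the loader acts on are all assumptions made for the example.

```python
# Sketch of preserving foreign mark-up as 'ghosts' across load and save.
# ASSUMPTION: ".PP" stands for a request the loading system can act on;
# every request line is kept as a ghost whether or not it is acted on.

def load_shared(source, understood=(".PP",)):
    """Return items: ('text', line) or ('ghost', line, acted_on)."""
    items = []
    for line in source.splitlines():
        if line.startswith("."):
            acted_on = line.split()[0] in understood
            items.append(("ghost", line, acted_on))
        else:
            items.append(("text", line))
    return items

def save_shared(items):
    """Write text and ghosts back out, re-creating the original mark-up."""
    return "\n".join(item[1] for item in items)
```

With no editing between the two calls, save_shared(load_shared(s)) returns s for any input that has no trailing newline. The harder questions raised in the text, such as copying half of a region whose point-size was changed, are not addressed by this sketch.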
Errors

If source files may be generated by conversion tools, editors, etc., they may well contain errors.
The design of source files should therefore contain enough redundancy for such errors to be 
detected. The design should also bear in mind that, on detecting an error, the hypertext system 
should have sufficient information to give a decent error message and stop gracefully, retaining as 
much of the source file as possible. 
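
Paired begin and end requests are one example of such redundancy. As a minimal sketch in Python, and not a description of how Guide itself reports errors, the checker below verifies that the Bu/bU and Pi/pi pairs mentioned earlier are properly matched and reports the line number of each problem. The function name and the wording of the messages are assumptions.

```python
# Sketch of error detection using the redundancy of paired requests.
# Only the two pairs named in this paper are checked.

PAIRS = {"Bu": "bU", "Pi": "pi"}   # begin request -> end request

def check_balance(source):
    """Return a list of error messages; an empty list means no errors found."""
    errors = []
    stack = []                      # (begin_request, line_number)
    closers = set(PAIRS.values())
    for number, line in enumerate(source.splitlines(), start=1):
        if not line.startswith("."):
            continue                # ordinary text
        parts = line[1:].split()
        request = parts[0] if parts else ""
        if request in PAIRS:
            stack.append((request, number))
        elif request in closers:
            if stack and PAIRS[stack[-1][0]] == request:
                stack.pop()
            else:
                errors.append(f"line {number}: unexpected .{request}")
    for request, number in stack:
        errors.append(f"line {number}: .{request} is never closed")
    return errors
```

Because the checker only reports and never discards input, the caller can show the messages and still keep as much of the source file as it was able to read.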

Abstractions and discipline 

The focus of this paper has largely been on the present rather nasty world. Ideally standards 
should look to the future as well as covering the present. 

Current usage of Guide (and doubtless of other hypertext systems too) has shown up two 
deficiencies: 

(1) a need for higher level abstractions than links, which are gotos.

(2) a need for each application to evolve a hypertext house-style and to impose this.

The two needs are related: many aspects of a house-style can be imposed by designing some 
special abstractions and then ensuring that authors use only those abstractions. This is similar to 
the way that document standards such as ODA (ISO, 1988) and SGML impose a general
document architecture. 

The ICL Locator project (Meehan, 1987; Brown, 1990), one of the biggest current Guide
applications, has successfully tackled (1) and (2) by producing a preprocessing tool for Guide that
helps (and constrains) authors to produce the required Locator style. However preprocessors are
not always the answer for the same reason that preprocessors to compilers for programming
languages are not always the answer. In the latter case the program author, when
maintaining/debugging a program, usually needs to be aware of its intermediate form and thus the
power of the abstraction that the preprocessor provides is lost.

Experience also shows that some environments want discipline and some want freedom. Thus
heavyweight mechanisms that affect everybody need to be avoided. 

Overall, therefore, it is desirable that source file formats contain facilities for defining or imposing 
abstractions, but these should be optional. It should still be possible for draconian managements 
to enforce their requirements; for example, currently some managements do not release the real 
Guide to their authors, but equate 'Guide' to a UNIX shell-script which loads the real Guide with 
certain options already pre-set, and perhaps with some of the items in Guide's normal menu either
suppressed or replaced. (Guide options are, incidentally, mostly controlled by UNIX environment 
variables and switches; some could profitably be controlled by mark-up within source files, but 
currently this is not supported.) 

Size of file 

The design of source file formats is somewhat influenced by the size of a typical file: is it a single
'page' or could a whole encyclopedia be stored in a single file? In practice Guide authors vary
considerably: some have tiny files and some have files containing megabytes of text. In the latter
case there is a significant pause while the file is loaded but thereafter speed is superb.

Typically the initial screen is a summary: a skeleton document with
buttons representing the components of the document. Initially no buttons are expanded.
However Guide's source file format, where normally the replacement of a button immediately
follows the button-name, means that the whole source file needs to be loaded in order to paint the
initial screen. Indeed because of this Guide always loads complete source files, making no effort
to restrict itself to the parts that are actually needed. In the environment where Guide runs,
workstations with a lot of real storage, supplemented by virtual storage, this has caused no
problems. However OWL's Guide, which can run in much more constrained environments than
UNIX Guide, has adopted a file format that does allow parts of the files to be loaded. OWL uses a
structured file format where associated tables designate where constructions begin and end.

Conversion between hypertext systems 

Although UNIX Guide and OWL Guide have identical parentage and similar hypertext
mechanisms, it would be a major job to convert source files between the two. This is not because
file formats are different, but because there are significant differences in the way linking is done
(e.g. UNIX Guide's late binding approach is not found in OWL Guide).

A conversion has never been attempted but, if it were, it would be a similar exercise to converting
between two somewhat similar programming languages; you may get an automatic tool to convert
90% of a program, but the rest would need doing by hand. Even within the 90% that was
automatically converted, there would be odd differences in program behaviour.

A complete conversion between two radically different hypertext systems would clearly be harder
still. It is not the source file format that is the problem, but fundamental differences in approach.
This is why we believe that this is the area where standard file formats have least to offer. There
is, of course, the possibility of a deeper standard which specifies how hypertext systems actually
work. In practice there is, however, no more chance of getting creators of hypertext systems to
agree than getting designers of, say, programming languages to agree.

Conclusions 

The tone of this paper has been at least lukewarm about standards. 

Nevertheless UNIX Guide can hardly claim to be a major force that will materially affect that
standardisation movement, and hence standards may come. If they do come we hope they:

• are geared to exchange with other software (word-processors, picture drawing
programs, databases, etc) rather than specifically with other hypertext systems.

• are geared to taking advantage of existing tools.

• are based on ASCII files that can be read, edited, etc, by humans, and can be sensibly
transmitted down pipes and similar mechanisms.

• can treat straight text files as a subset of hypertext files, rather than as special cases.

• are not based on a specific linking mechanism. If late binding is used, the linking
mechanism is not very relevant to source formats.

• allow flexibility in the region of replacement so Guide enquiries and their equivalents
in other systems can be supported.

• cater for higher-level user-defined abstractions and house-styles.

• allow other software to share hypertext files without the need for conversion
problems.
1. Introduction

Hypertext literature tends understandably to concentrate on what is new and to ignore, or take for granted,
the properties of hypertext that are also present in paper documents. The purpose of this paper is to
consider how the expertise that exists in standards and models for paper documents can be used to save
effort when designing a standard for hypertext, and how to make hypertext and paper document standards
compatible. Section 2 discusses some relevant similarities between paper and hypertext documents.
Section 3 introduces relevant aspects of the Office Document Architecture (ODA) [1] and suggests ways to
build on ODA to create a standard that combines the strengths of the two areas.

2. Similarities between paper and hypertext documents 
2.1. The need to separate the logical structure and its presentation 

Although hypertext systems vary widely in appearance and functionality they generally have similar
underlying document structures - directed graphs in which the nodes hold the content and the arcs
represent links of various types. The way in which the nodes and links are presented on the screen, and
what happens when a link of a particular type is activated, are peculiar to (and usually hardwired into) the
hypertext system.

If a standard for hypertext is to be effective, it must allow a hypertext to be created on one system and 
presented on another. In particular it must allow for the possibility that the receiving system does not have 
the capability to perform the presentation as intended on the original system. To do this it should represent 
separately: 

(i) the components in the underlying logical structure; 

(ii) the specification of presentation facilities on each participating system (including dynamic properties
such as the actions allowed when hotspots are selected); 

(iii) a mapping from (i) to the relevant set in (ii) for each participating system. 

This separation of the logical structure from the method of presentation is not just an inconvenience needed
for portability; it is a positive feature that can be used to give hypertext some of the advantages that were 
given to paper documents by generic markup and structured editors. 

Markup of documents intended for paper used to be, and in many cases still is, presentation oriented.
Formatting commands are inserted into the document to request explicit presentation features such as
moving the current print position or changing to a given font style or size. Generic markup, on the other
hand, is concerned with the logical structure of the document - it marks portions of the content as
belonging to particular named classes. The actual layout and presentation are bound to the name later
(either by the publisher, using traditional markup, or by a computer formatting system). Generic markup is
essentially for non-interactive systems. The interactive equivalent is the structured document editor, which
works in a similar manner by assigning a named class to each document constituent and providing separate
'style sheets' to specify the presentation of constituents belonging to the class. The appearance of all
constituents belonging to the class can be changed by altering the style sheet.

In both cases the effect is to separate low-level presentation details from the logical document structure and
content (as in (i) and (ii) above) and to allow or provide a means of binding the two together at a later stage.
This late binding corresponds to the mappings in (iii) above.

In the logical structure of the document the named classes should correspond to the function of the content
rather than the method of its presentation ('title' or 'reference' rather than 'change to bold type' for
example). Generic markup and structured editing are acknowledged (see [2] for example) to have many
advantages including:

• making it easier to present the document in another style (that of a different publisher, for example)
without extensive manual changes to the text - this is the paper equivalent of presenting a hypertext
on a different system.

• helping to maintain a consistent style throughout the document, and making it easier to enforce a
house style.

• improving typographic quality by discouraging authors from dabbling in low level details and
leaving the design of styles to experts.

• forcing the author to consider the structure of the document. This usually results in a better structure
- and could be particularly important for hypertexts.

Where layout and presentation facilities are complex, this separation of the logical and presentation aspects
of the document often results in considerable factorisation of information and consequently in reduced costs
for transmitting a document.

2.2. Links 

Paper documents have links - intra-document links to components of the logical structure ("see section
3.5") or to part of a particular representation ("see page 27"), and inter-document links (bibliographic
references). Each link (in a well-written document) is accompanied by some indication of what the reader
can expect to find at the other end, or at least the reason the author has for directing the reader there.
Hypertext differs only in that, instead of indicating the position ("page 27") of the remote object, it offers
some means of automatically accessing and presenting the remote object.

If a system is to be able to edit or reformat a paper document and still retain the integrity of its links, then
each link must be represented at the logical level in much the same way as it would be in a hypertext. It
might, for example, have a type, a reference to the identifier of a remote object and, associated with the
type, a specification for how it is to be presented.

2.3. Hierarchical structures 

In paper documents the logical components referred to above are typically arranged in a hierarchical
tree-like structure. A book, for example, might contain chapters which contain sections which contain
paragraphs. This structure is primarily a tree but it may be supplemented by link components that cut
across the normal tree links and turn the structure into a directed graph.

Although hypertext systems emphasise the links more than paper documents, their underlying models are
similar. Indeed, several hypertext systems recommend or enforce a general hierarchical model to minimise
the well-known problem of readers becoming lost [3,4].

To represent a hypertext within the hierarchical model for paper documents, we could start by assuming
that the logical structure components referred to above might simply be the links and nodes of the
document. In this case each node would be very simple, consisting of a single piece of basic information
together with hierarchical and non-hierarchical links. The hierarchical links would form the basic tree
structure, and the non-hierarchical links would be the link components.

Most hypertexts could not be represented by such a simple structure, however, and there is a need for
internal structure for a node. A finer granularity is needed, in which each node is structured hierarchically
into a number of subordinate components (including links) representing paragraphs, parts of paragraphs,
diagrams, buttons, hotspots and such like. The hypertext node thus becomes a subtree and this makes it
possible to represent the node in a way very similar to that in which we represent a page of a paper
document (although in some cases the 'page' might be so large that it needs to be scrolled). Rules for
laying out and presenting the components of the node could then be specified in the way they are specified
for a page of a paper document.

2.4. Style and the problem of getting lost 

As shown above a single node of a hypertext is similar in many respects to a page or logical section of a
paper document, and it has long been recognised that the meaning of a page of information - and the ease
with which this meaning is understood - is very dependent on the skill with which the page is laid out.
Those unskilled in the art of typography are well advised to leave the design of the document styles to
experts. For a hypertext, style would include the positioning and presentation of different types of button or
hotspot as well as text and diagrams. The structures described above would allow all the sophistication
used for laying out a page of a paper document to be applied equally to laying out a node of a hypertext.

Early applications of the standard will probably be in automatic translators between existing hypertexts and
the standard, in which case the separate logical structure and the late binding will initially be hidden from
end users. It would be wise however to ensure that the standard allows for future improvements in
hypertext. A reasonable assumption is that hypertext systems could learn design techniques from paper
document processing systems, including the principles inherent in generic markup, in order to gain the
advantages listed above and especially to help authors to improve the styles of their hypertexts.

Well defined and consistent styles have a bearing on the problem of getting lost in hyperspace [4], the
solution to which has often been considered to be a matter of giving the user a suitable overall graphic view
(or map) of all or part of the document. There is reason to believe that this may not be the only or even the
best method [5] and that perhaps good authorship may make it unnecessary for the user (including authors)
to be aware of the underlying directed graph. Well-designed generic styles could be a way of helping users
with this problem.

2.5. Compatibility between paper and hypertext documents 

It would be foolish to ignore the need to produce a paper version of part of a hypertext, and it also seems 
sensible to make provision for readers to have the advantages of hypertext navigation when viewing a 
document on the screen - even if the document is eventually intended to be read from paper. These aims
could best be achieved by having a common underlying representation for the structures of both types of
document, together with well designed ways of mapping those structures onto different forms of
representation. It is not suggested, of course, that a document designed for paper would necessarily make a
good hypertext or vice versa, only that a usable representation should be readily available by applying 
different presentation styles. 

3. A hypertext standard based on ODA? 

ODA is a standard for the storage and interchange of complex multimedia documents. The ODA document 
model is hierarchical and object-oriented. It caters for both source (processable) documents and output 
(formatted) documents. Currently ODA documents can contain three types of content (character, raster 
graphics and geometric graphics) but other types of content will soon be added. 

Several major extensions to ODA are already under consideration in the relevant committees and working
groups. These include tabular layout, video material, the inclusion of data in documents - and hypertext.
The SGML [6] community is also starting to consider hypertext extensions. It would be tragic if three
separate hypertext standards emerged: one based on ODA, one based on SGML, and a completely separate
one from the hypertext community. After several years of rivalry and backbiting the ODA and SGML
committees are showing encouraging signs of working together, so there is some hope that these two may
merge.

The details and suggestions given below are based on ODA, largely because ODA currently includes 
graphics and images and defines a layout process to map from the logical structure of the document to a
formatted form. However, the general principles could apply to SGML when used with DSSSL [7] which
defines a presentation model for SGML documents. 

The following subsections describe the features currently in ODA that make it useful as a basis for a
hypertext standard and then the features that we believe must be added. These new features are needed to
improve the ability of ODA to represent all the features of high quality paper documents, but are also
intended to prepare the way for the hypertext extensions to ODA.

3.1. What ODA can already offer to hypertext 

The following sections give a brief description of ODA as it applies to paper documents.

3.1.1. ODA Document Architecture 

ODA provides a tree-like model of a document. The structure of the document is given by the shape of the
tree, while the content is stored entirely in the leaf objects. Attributes provide information about the
objects. A few of the most important attributes are introduced in the examples and discussion below. Only
one needs to be mentioned at this stage. This is the content architecture attribute that defines the type of
content for each leaf object and thus allows different types of content to co-exist within the document.

An ODA document is described by two structures. The logical structure divides and subdivides the content
of the document into logical objects that mean something to the human author or reader. A logical object
may be a general item like a section, title, paragraph or reference. Alternatively it may be a specialised
item like a telephone number or price, or a collection of related information like a list of companies selling
a particular product. Only the lowest level objects, such as titles or prices, have content.

The layout structure is concerned with a visible representation of the content. It divides and subdivides the
content into page sets, pages, and rectangular areas within pages. Rectangular areas with nested areas
defined within them are known as frames. The lowest level areas are known as blocks and, by definition, are
the only areas to have content associated with them. A frame might be used to represent a column of text,
for example, with nested blocks for the content of individual paragraphs.

Each document has its own specific logical and specific layout structure, but their creation is guided and
controlled by generic document structures for that particular class or 'style' of document. These are sets of
object type definitions (one set for logical objects and one for layout objects) that specify the types and
combinations of objects allowed. In ODA terminology the definitions constitute the generic logical and
generic layout structures for a document class.

3.1.2. Examples of ODA Structures

This section illustrates the structures introduced above by presenting snippets of the generic structures that
might be used for a journal containing technical papers. It also introduces a few important attributes.

The generic definition for each non-leaf object has an attribute called generator for subordinates that
describes how the object may be made up from subordinate objects. These indicate that subordinate objects
may be optional (OPT), required (REQ), repeated (REP), or optional and repeated (OPT REP), and that a
group of objects may occur in a given sequence order (SEQ), in any order (AGG), or as a choice where
only one of the group occurs (CHO). The information given in these attributes provides a simple grammar
for the primary structure of the document class.

Figure 1 shows the generic logical structure for a single technical paper in the journal. It indicates that the
paper consists of a compulsory title, followed by a compulsory author's name, followed by an optional
abstract, followed by one or more sections. If the abstract is present it consists of a single paragraph. Each
section begins with a subtitle. The 'REP CHO' construct indicates that the subtitle is followed by a series
of paragraphs or lists occurring in any order. Lists consist of one or more list items. (In practice, a more
complex structure catering for items like footnotes and diagrams would be needed.)
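
The grammar described for Figure 1 can be made concrete with a small checker. This is a minimal
sketch, not part of ODA: the tuple encoding of the generators, the "OBJ" tag and the function names
ends and conforms are assumptions made for illustration, and the any-order construct is left out for
brevity.

```python
def ends(expr, names, i):
    """Return the set of positions at which expr can stop matching names[i:]."""
    kind = expr[0]
    if kind == "OBJ":                       # one subordinate of the named class
        return {i + 1} if i < len(names) and names[i] == expr[1] else set()
    if kind == "REQ":                       # required: exactly once
        return ends(expr[1], names, i)
    if kind == "OPT":                       # optional: zero times or once
        return {i} | ends(expr[1], names, i)
    if kind in ("REP", "OPT REP"):          # one-or-more / zero-or-more
        found, frontier = set(), ends(expr[1], names, i)
        while frontier - found:
            found |= frontier
            frontier = {e for s in frontier
                        for e in ends(expr[1], names, s) if e > s}
        return found | ({i} if kind == "OPT REP" else set())
    if kind == "SEQ":                       # parts in the given order
        positions = {i}
        for part in expr[1:]:
            positions = {e for s in positions for e in ends(part, names, s)}
        return positions
    if kind == "CHO":                       # exactly one of the alternatives
        return {e for part in expr[1:] for e in ends(part, names, i)}
    raise ValueError(f"unsupported construct: {kind}")


def obj(name):
    return ("OBJ", name)


# Generators for subordinates, following the description of Figure 1.
GENERIC_LOGICAL = {
    "Paper": ("SEQ", ("REQ", obj("Title")), ("REQ", obj("Author")),
              ("OPT", obj("Abstract")), ("REP", obj("Section"))),
    "Abstract": ("REQ", obj("Paragraph")),
    "Section": ("SEQ", ("REQ", obj("Subtitle")),
                ("REP", ("CHO", obj("Paragraph"), obj("List")))),
    "List": ("REP", obj("List item")),
}


def conforms(object_class, subordinates):
    """True if the list of subordinate class names satisfies the generator."""
    generator = GENERIC_LOGICAL[object_class]
    return len(subordinates) in ends(generator, subordinates, 0)


print(conforms("Paper", ["Title", "Author", "Section"]))
print(conforms("Paper", ["Title", "Section"]))
```

The first call describes a paper without an abstract and is accepted; the second omits the
compulsory author's name and is rejected.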

The corresponding generic layout structure might define one page style for the first page of the paper, and a
different style for all subsequent pages. Figure 2 shows the top level of such a structure. The 'Title page'
contains a 'Header frame' representing an area set aside for the title, author's name and abstract, and a
'Body frame' for the start of the first section. The 'Continuation pages' contain 'Continuation body frames'
to hold the rest of the sections. (Again, in practice, further frames would be needed for items like running
titles.) Blocks are not included in the generic layout structure but are assigned to pages and frames during
the layout process as outlined below.

Figure 1: Generic logical structure

Figure 2: Generic layout structure

ODA's layout process decides exactly where each item of the document is to be placed. It uses the specific 
logical structure, the generic structures, and the content architectures to create the specific layout structure. 
It works at two levels:

Content layout takes portions of content and lays them out into blocks. This stage is dependent on
the content architectures involved and on sets of attributes known as presentation styles.

Document layout places blocks in frames or pages. This stage is dependent on sets of attributes
known as layout styles.

The content layout process thus deals with character sets and the fine positioning of items within blocks,
while the higher level document layout process decides how to place the blocks within pages and frames.

The document layout process is guided by three attributes whose values are shown in italics in Figures 1
and 2. Layout object class is normally used to indicate that a major logical division of the document should
be directed into a particular page or page set. In the example the logical 'Paper' has its layout object class
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Within a layout object class, the attributes layout category and permitted categories can be used to direct 
logical objects into different frames. If a leaf logical object is given a layout category name, it can only be
laid out in a frame that has the same name as one of its permitted categories. In the example the only
category names used are 'Head' and 'Body'. When the layout process tries to place the blocks
corresponding to the title, author's name, and abstract (if present), it will look for a frame with 'Head' as a
permitted category, and will therefore create a 'Title page' and place them in the 'Header frame'. But when
it reaches the blocks corresponding to the contents of the sections it looks for frames with 'Body' as a
permitted category, so it uses the 'Body frame' until that is full and then creates 'Continuation pages' as
necessary in order to use the 'Continuation body frames'.
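
The way categories steer blocks into frames can be sketched as follows. This is only an illustration
under stated assumptions: a frame is treated as 'full' after a fixed number of blocks (a real layout
process measures space), and the names GENERIC_LAYOUT, new_page, find_frame and lay_out are
invented for the sketch.

```python
# Each page class lists its frames as (name, permitted categories, capacity).
# The capacity in blocks is a stand-in for the real test of a frame being full.
GENERIC_LAYOUT = {
    "Title page": [("Header frame", {"Head"}, 3), ("Body frame", {"Body"}, 1)],
    "Continuation page": [("Continuation body frame", {"Body"}, 2)],
}


def new_page(kind):
    """Create a specific page with empty frames from its generic definition."""
    return {"kind": kind,
            "frames": [{"name": name, "permitted": permitted,
                        "room": room, "blocks": []}
                       for name, permitted, room in GENERIC_LAYOUT[kind]]}


def find_frame(page, category):
    """Return the first frame on the page that permits the category and has room."""
    for frame in page["frames"]:
        if category in frame["permitted"] and frame["room"] > 0:
            return frame
    return None


def lay_out(blocks):
    """Place (block name, layout category) pairs in order; return the pages."""
    pages = [new_page("Title page")]
    for name, category in blocks:
        frame = find_frame(pages[-1], category)
        if frame is None:
            # Nothing suitable is left on the current page: continue on a new one.
            pages.append(new_page("Continuation page"))
            frame = find_frame(pages[-1], category)
            if frame is None:
                raise ValueError(f"no frame permits category {category!r}")
        frame["blocks"].append(name)
        frame["room"] -= 1
    return pages


blocks = [("Title", "Head"), ("Author", "Head"),
          ("Paragraph 1", "Body"), ("Paragraph 2", "Body"), ("Paragraph 3", "Body")]
for page in lay_out(blocks):
    print(page["kind"], [(f["name"], f["blocks"]) for f in page["frames"]])
```

With these assumed capacities the title and author's name go to the 'Header frame', the first
paragraph to the 'Body frame', and the remaining paragraphs to a 'Continuation body frame'.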

[...] content with pages, frames and blocks. The two specific structures are related and come together at
the level of the content. Figure 3 shows a fragment of the specific structures for the beginning of a paper.
It assumes the paper has no abstract and that the first section begins with three paragraphs, only one of
which fits onto the title page.

Figure 3 shows a neat one-to-one correspondence between logical objects and layout objects. This often
occurs, but not always. Logical content portions may, for example, be split between blocks (when
paragraphs are split over pages) or concatenated into paragraphs occupying a single block.

3.1.3. Providing Different Views of an ODA Document

The previous section gave only a brief sketch of the ODA layout process, but it should be sufficient to show
that the appearance of a specific logical document can be altered by judicious changes to its generic layout
structure. As a simple example, deleting the 'Body frame' from the 'Title page' in Figure 2 would cause
each paper to be laid out with only the title, author's name and abstract on the first page. There would be
no frame on the first page with 'Body' as a permitted category, so the first section would have to start on a
new page in a 'Continuation body frame'.

More radical changes to the layout can be achieved by altering the attributes that make up the layout and
presentation styles. The attributes in these styles apply to logical objects, but the objects contain only a
reference to the appropriate style. The styles themselves are held separately. This provides a more concise
document representation and allows the styles to be changed without changing the logical structures.

The layout styles include the layout object class and layout category attributes (described in the previous
section) and other attributes governing the selection of frames and the positioning of blocks within a frame.
The same layout object attribute, for example, constrains the block containing the logical object to share the
same frame as the block containing another specified object, while new layout object forces the block
containing the object to start a new frame. Offset and separation control the minimum spacing between
blocks [...]. The position of blocks is dictated by fill order which allows normal top-to-bottom
positioning or traditional footnote positioning.

The presentation styles guide the lower-level content layout process and thus affect the appearance of
content within blocks. They contain different attributes for different content architectures. For character
content, for example, they include attributes affecting the indentation of the first line, the distance
between lines, and the initial font size.

Changing the generic layout structure and the styles can lead to significantly different views of the same
logical document. Page and margin sizes can vary, single or double [...] paragraph spacing and font size
can change. In particular, it is possible to cater for different house styles [...] to provide different styles
for interactive editing and the final printed version. ODA is not [...] in this respect because it has
insufficient separation between the logical and layout structures. We are attempting to get this changed
(see below).



3.2. What ODA still lacks

The structures and styles introduced above form a good basis for a flexible standard for paper documents
[...] some of the requirements for a hypertext standard. We have identified [...] deficiencies in the ODA
standard and have investigated changes to the [...]. The changes are needed in order to improve the
representation of paper documents but were designed with the aim of preparing the way for an extension of
ODA to deal with hypertext. ISO/IEC JTC1/SC 18/SWG (the [...] for changes to the standard) has already
declared its intention to [...]. We have explained the deficiencies and our suggestions [...] paper [8] that
is to be considered by the special working group in January 1990. Brief outlines of the deficiencies for
which we have offered cures are given below.

3.2.1. Separating logical structure from presentation

One of the strengths of ODA is its attempted separation of the logical and layout structures, but this does
not go far enough so we have made suggestions to make it complete. If it is required to change the style of
a document (to the house style of a different company or different publisher, for example) it should not be
necessary to edit the logical structure, only to apply a different set of layout and presentation styles to
create a different 'view' of the same logical document. This facility to change the view without changing
the document is part of the answer to the problem of exchanging hypertexts between different systems that
have different presentation capabilities or different presentation conventions.

3.2.2. Comprehensive attribute inheritance

The ODA mechanism for inheriting layout and presentation attributes [...] is insufficient. If an attribute
value is not specified for the object or its class then the default value is inherited according to the object's
position in the tree and not [...] (ordered list etc.). Our suggestion for supplying this facility is the
addition of 'style tables' as described in [8]. In brief, this enables the style inherited by an object (and
therefore the way it is formatted) to depend both on its class and on its position in the document. This
mechanism is valuable for hypertext applications in making it possible to distinguish objects of the same
type that are in different states (open [...]) [...] extended so that it can specify changes of state (such as
those that take place when a hotspot is selected) by changing the style table.
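
The idea that a style should depend on both the class of an object and its position can be illustrated
with a small lookup. This sketch is not the style table mechanism proposed in [8], whose details are
not given here: the rule format (class, enclosing class, style), the style names and the function
lookup are all assumptions made for illustration.

```python
def lookup(style_table, object_class, ancestors):
    """Return the style of the first rule matching the class and the position.

    ancestors lists the classes of the enclosing objects, outermost first.
    A rule whose context is None matches an object in any position.
    """
    for rule_class, context, style in style_table:
        if rule_class == object_class and (context is None or context in ancestors):
            return style
    return None


# Hypothetical table: position-specific rules come before the class-wide fallback.
PAPER_STYLES = [
    ("Paragraph", "Abstract", "italic"),
    ("Paragraph", "List item", "indented"),
    ("Paragraph", None, "plain"),
]

print(lookup(PAPER_STYLES, "Paragraph", ["Paper", "Abstract"]))
print(lookup(PAPER_STYLES, "Paragraph", ["Paper", "Section"]))
```

Under this assumed format, a change of state amounts to looking the same object up in a different
table, leaving the logical structure untouched.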

3.2.3. Links

In both paper and hypertext views a document designer must be able to specify the purpose of each link,
and to specify how the layout process can express that purpose. In this respect the requirements for links
are very similar to those for logical objects, so it seems reasonable to deal with them in the same way - by
having classes for links. The class of the link should determine how and where in the document the link
can be used and it must be possible to specify the representation of the link in a way that depends on both
the class of the link and also its position in the document.

[...] of objects allows a document designer to use those logical objects as links, with all the functionality
described above. These additions do not in any way change existing definitions or change the validity of
existing documents.

3.2.4. Selective and multiple presentation 

ODA does not have a mechanism for specifying that a logical object should be ignored in the layout
[...] could, for example, allow [...] without those annotations appearing in a printout, or could
allow different versions of the document to be produced for different situations. To achieve this we have
suggested a simple variation on the style table mechanism described above. This facility is obviously
needed for hypertext because most of a hypertext is not presented at all until selected by the user.

3.3. Extensions and Interactive Documents

This section shows how the proposed extensions can be applied to screen based documents and hypertext in
general and then looks in more detail at how they can be applied to two particular hypertext systems.

ODA allows a measure of flexibility in the layout and presentation of documents, but different views are
not a substitute for proper interactive facilities. The basic problem is that the ODA layout process is
sequential and page based - and several attributes reflect this. Any form of online editing requires
extensions to the layout process to make it incremental and to allow the user to scroll around the document,
but some more ambitious features desirable for screen-based documents are:

(i) An outline facility - to display selected (usually high level) items, such as chapter and section
headings, and ignore other items.

(ii) Pop-up displays - to allow the temporary display of additional information on demand. These can
be used for the equivalent of footnotes, marginal notes, and glossary entries in paper documents.

(iii) Folding - to allow sections of a document to be hidden behind a 'button' on the screen and revealed
on request. Folding should be allowed to any level, so hidden sections can contain further buttons.

(iv) A linkage facility - to enable users to follow links or cross-references automatically.

Item (i) is dealt with by style tables that select objects by class and required level.

Item (ii) is dealt with by changing to another style table to produce a pop up display and then changing
back again when the display is no longer required.

Item (iii) is an extension of item (ii). The layout process needs to be able to display either the button or the
item(s) folded behind the button. One way to do this is to have both the button text and the folded
components as subordinates of the button object. The button is closed when a style table is applied that
displays just the button text, and it is opened by applying another style table that displays the folded items
(and possibly the button text as well).

Item (iv) could be done in several ways depending on the type of link. Three possibilities are:

• Move the current point of display to the target object.

• Display the target object (or subtree) as a temporary pop-up item.

• Include the target object (or subtree) at this point in the document.

These can be achieved with a combination of style tables and links. The style table specifies whether or
not to display the linked object. When the style table is changed the linked object can be displayed as a
new layout object (like a card), as a pop up item, or inserted inline with the surrounding content.
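
The folding case, item (iii) above, can be sketched as follows. This is a simplified illustration
under assumptions of our own: each button carries an identifier, the two style tables CLOSED and
OPEN simply switch the visibility of the two kinds of subordinate, and the node encoding and the
function render are invented for the sketch.

```python
# Two hypothetical style tables: each says which subordinates of a button
# are displayed. The logical structure is the same in both cases.
CLOSED = {"Button text": True, "Folded": False}
OPEN = {"Button text": False, "Folded": True}


def render(node, open_buttons):
    """Return the visible lines of a tree of (class, identifier, payload) nodes."""
    cls, ident, payload = node
    if cls == "Text":
        return [payload]
    if cls == "Button":
        table = OPEN if ident in open_buttons else CLOSED
        label, folded = payload
        lines = [f"[{label}]"] if table["Button text"] else []
        if table["Folded"]:
            for child in folded:
                lines += render(child, open_buttons)   # folding to any level
        return lines
    raise ValueError(f"unknown object class: {cls}")


document = ("Button", "b1", ("Object descriptions", [
    ("Text", None, "Each object is characterised by a set of attributes."),
    ("Button", "b2", ("More", [("Text", None, "Further detail.")])),
]))

print(render(document, set()))
print(render(document, {"b1"}))
print(render(document, {"b1", "b2"}))
```

Opening or closing a button changes only which table is applied to it; the button text and the folded
components remain subordinates of the same button object throughout.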

3.3.1. Modelling Guide Buttons in ODA 

Guide [9, 10] is a hypertext system that supports a hierarchical model of a document and also allows
cross-linking of information. A typical Guide document presents the reader with a summary consisting mainly of
buttons. These can then be selected to reveal greater levels of detail as required. Buttons may be nested
many levels deep. The reader selects only the buttons he is interested in, and if he finds he is not interested
in the information revealed he can 'undo' the selection and fold the information back behind the button
again. Guide is also a WYSIWYG editor. It allows the reader to edit the contents of the document and to
add or delete buttons, thus becoming an author as well. The emphasis is on allowing the reader to tailor the
document to his own requirements.

The overall Guide model is similar to ODA's hierarchical model, but with the added concepts of 

(i) Folding logical items behind buttons. 

(ii) Allowing more than one button to access the same logical items.

Guide's layout model is of a single long scrollable frame holding all content except temporary pop-up
items. Using an ODA framework could enrich the Guide layout model. To show how the Guide model fits
with ODA, we shall introduce two different types of Guide button and explain how they might be
represented. (The examples use the UNIX version of Guide, which is similar to the version marketed by
OWL for the Apple Macintosh [11] but differs in some details.)

The commonest type of button is the replacement-button. When a replacement-button is selected, the
button itself disappears and is replaced by information that may in turn contain further buttons. The
replacement is inline, so surrounding text may be reformatted or scrolled out of the way to make room for
the replacement.

Figure 4 shows two different views of a Guide version of part of the ODA standard. In Figure 4(a) the
visible text is made up entirely of buttons giving section headings. (By convention, Guide buttons appear
in a distinctive font - typically in bold - so that readers can recognise them.) Figure 4(b) shows the
result of selecting the 'Object Descriptions' button. Two further buttons are shown within the replacement.
The 'More' button is another replacement-button for the user to select if he requires more detail. The
words in italics are a different type of button known as a glossary-button. If the reader selects a
glossary-button an explanation of the term appears temporarily in a separate window.

To represent Guide buttons in an ODA document we would not set about defining a special new ODA 
object class for each type of button. Instead, for replacement-buttons, we would look first at the existing
objects in a document class, decide which were appropriate as buttons, and apply style tables that would
make them behave like buttons. Sections might be considered suitable for use as buttons, in which case the 
subtitle might be displayed as the button text, and the whole object displayed when the button is selected. 
Other classes of object (list items for example) might be modified for use as buttons by adding some 
abbreviated version as a button text component. 

There are several variations on the basic replacement-button. The simplest form is the local-button where
the replacement applies only to the button itself. This is the default type described above. Two other forms
are the definition-button and usage-button. For definition-buttons the replacement applies not only to the
button itself but also to usage-buttons with the same 'name'. (Guide provides a mechanism for attaching
names to the buttons.) It might be more efficient to mirror this in ODA by providing usage-buttons with
button text and a link to the appropriate definition-button object. This then becomes a general mechanism
for attaching the subtree containing the replacement content to several places in the document.

Glossary-buttons are like footnotes, annotations, glossary entries, or other embellishments to the main
document. Unlike replacement-buttons their replacement is not part of the main document; instead it is
typically a short piece of pop-up text. We could represent glossary-buttons in ODA by defining a new
'Glossary-button' generic object with a generator for subordinates specifying a button text item and a
'Glossary-text' item. 'Glossary-text' would normally be defined as a simple leaf object with character
content (to represent the explanation text). However glossary-buttons are intended to provide the same
explanation for each reference to a term or item throughout the document, so it is attractive to think of a
variation, similar to the usage-button, with a link to the appropriate explanation text.

    2.3.2 Content portion descriptions
    2.3.3 Object descriptions
    2.3.4 Object class descriptions
    2.3.5 Styles
    2.3.6 Document profile
    2.3.7 Document class descriptions

    (a) Summary containing unexpanded buttons only

    2.3.2 Content portion descriptions
    2.3.3 Object descriptions
    Each object within a structure is characterised by a set
    of attributes called an object description.
    Each attribute has a value and may represent one of the
    following More
    2.3.4 Object class descriptions

    (b) Result of selecting 'Object Descriptions' button

Figure 4: Guide document showing (a) button and (b) expanded button

3.3.2. Modelling KMS Frames in ODA 

KMS [3] supports a data model based on workspaces known as frames. Frames may contain text, graphics
and image items, and individual items within frames can be linked to other frames. There is no built-in
notion of hierarchical organisation and no concept of a linear ordering of information. Information is
divided into frame-sized chunks and one chunk is displayed in each window on the screen. The reader
follows links to view different frames.

In spite of this very general model, strong conventions have evolved for the format of frames and for
distinguishing between hierarchical links and other links. Figure 5 shows the overall layout of a
conventional KMS frame. (To avoid confusion this section will use 'KMS frame' and 'ODA frame' to
distinguish the different meanings.)

The generic logical objects defined to support a standard KMS database would correspond to the KMS
frame and the items within the KMS frame. Figure 6 shows the top levels of a possible generic logical
structure.

The generic layout structure for a KMS frame would correspond to an ODA page with ODA frames
representing the areas shown within the KMS frame in Figure 5. Layout object class would be used to
direct each KMS frame into a single instance of this ODA page, and layout category and permitted
categories would be used to direct the different logical items into the appropriate ODA frames.

The 'tree' and 'link' items would be set up like the replacement-buttons described for Guide in the previous
section. Thus 'tree' items would be like definition-buttons and would have two subordinates: the button
text to be shown in their parent KMS frame and another KMS frame (to be shown if the button is selected).



    Frame title
    Frame body
    Tree items (links to frames at next level)
    Link items (cross-references)
    Command items

Figure 5: Layout of a typical KMS frame

Figure 6: Generic logical structure for a KMS frame

The 'link' items would be similar to usage-buttons. They would have button contents to be shown in their
parent KMS frame, and a link to the remote KMS frame. The layout process could be relatively simple as
it only needs to display complete KMS frames and to follow the primary and secondary links to further
KMS frames given in the 'tree' and 'link' objects.
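
The distinction between 'tree' items (which hold the next frame as a subordinate) and 'link' items
(which hold only a reference) might be sketched as below. The class KMSFrame, its field names and
the functions index and select are assumptions made for this illustration; frames are identified by
title purely for simplicity.

```python
from dataclasses import dataclass, field


@dataclass
class KMSFrame:
    title: str
    body: str = ""
    tree_items: list = field(default_factory=list)  # (button text, KMSFrame)
    link_items: list = field(default_factory=list)  # (button text, target title)


def index(frame, table=None):
    """Collect every frame reachable through 'tree' items, keyed by title."""
    table = {} if table is None else table
    table[frame.title] = frame
    for _, child in frame.tree_items:
        index(child, table)
    return table


def select(frame, button_text, table):
    """Return the frame to display when the named item is selected."""
    for text, child in frame.tree_items:
        if text == button_text:
            return child              # hierarchical link: the subordinate frame
    for text, target in frame.link_items:
        if text == button_text:
            return table[target]      # cross-reference: resolve the link
    raise KeyError(button_text)


layout = KMSFrame("Layout")
styles = KMSFrame("Styles", link_items=[("See layout", "Layout")])
home = KMSFrame("Home", tree_items=[("Layout", layout), ("Styles", styles)])
table = index(home)

print(select(home, "Styles", table).title)
print(select(styles, "See layout", table).title)
```

In this sketch the 'tree' items alone define the hierarchy, so the whole database can be indexed by
walking them, while 'link' items are resolved through the index when they are selected.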



4. Conclusion

A great deal of effort has gone into the production of the ODA standard and much practical experience has
been gained. A new hypertext standard should not try to reinvent the wheel. We believe the best solution
is to combine the existing expertise enshrined in the ODA (and SGML) communities with the expertise in
the hypertext community. We must avoid having two or three separate standards and squandering the
efforts of the few experts available.



Standards for a Hypermedia Database:
Diachronic vs. Synchronic Concerns 

Gregory Crane 
Perseus Project 
Department of the Classics 
Boylston 319 
Harvard University 
Cambridge MA 02138 

This paper outlines the perspectives of a professor in one traditional branch of the 
humanities (Classics). My colleagues and I are engaged in creating a hypermedia database
on ancient Greek civilization, but our work is intended to explore the generic issues of 
building a complex hypermedia database, and Perseus was conceived as a model for what 
should (and no doubt should not) be done. We have encountered a number of problems 
along the way that must be solved before information disseminated in a hypermedia 
environment can have more than marginal impact on intellectual activity. This paper 
addresses hypermedia databases: although much of our work revolves around texts and 
still images, we can see that sound, animation, and motion video are also basic categories 
of information. This paper at least views hypertext as a subset of hypermedia. 

The argument of this paper can be summarized simply. Standards for hypermedia 
must emerge before hypermedia databases can be fully useful, but long-lived standards can 
only emerge after we know much more about how people will use hypermedia databases. 
Since we can do qualitatively different things in a hypermedia environment, we must 
assume that new usage patterns will emerge. Practically speaking, we can expect to see 
short-term interchange tools so that we can move data from one hypertext system to another, but 
we should be prepared to abandon these standards if they prove too inflexible. The rest of 
this paper outlines some pragmatic concerns. 

Standards can be viewed as working in two dimensions, synchronic and diachronic. 
Synchronically, hypermedia standards would allow all hypermedia systems at any one time 
to exchange and share information: thus, NoteCards, HyperCard, Intermedia, HyperTies, 
Guide etc. could all exchange the same data. Synchronic standards are, in some measure, 
feasible, and are a crucial first step. This paper, however, focuses on diachronic 
continuity: the same hypermedia database must be equally useable now and for many years 
to come. In fact, any hypermedia database that fits cleanly into any existing hypermedia 
system will probably not long survive. Synchronic standards will provide us with 
experience and knowledge that we can use to create truly diachronic standards. If we are 
lucky, synchronic will evolve into diachronic, without sharp breaks in continuity. 

For many, synchronic is more important than diachronic continuity. We do not need 
to preserve for centuries all the product documentation for every computer system available 
in 1990. Even a 1970 paper on new directions in punch card technology, for example, 
would have little appeal to the engineer today. The Historian of Science may some day 
wish to study this technology, but we cannot preserve everything. In such areas, 
information must be disposable. 

The notion of disposable information has profound implications. If one's ideas will 
only be valuable for five or ten years anyway, then the author may not care very much if 
those ideas are stored in a hypermedia system that is itself equally ephemeral. FRESS, an 
early hypertext system released at Brown in 1971, was demonstrated at Hypertext '89, but 
it appeared there as an historical artifact rather than a living system (its official title was "A 
Blast from the Past: The Last (?) FRESS Demo"). For others with a potential interest in 
hypermedia such as textbook publishers, short-lived systems are ideal, since they can thus 
attack the used-textbook market and force students to buy new electronic "textbooks" with 
greater regularity. 

It is hard to overemphasize how destructive such attitudes are. True publication, however, 
implies that a document will be part of the public record for an indefinite period of time, not 
just for a few years. In many disciplines no scholar can afford to lavish time on creating 
documents that will not last at least thirty years and, hopefully, much longer. This holds 
true not just for humanists creating tools such as critical editions of authors (e.g., Homer, 
Chaucer), dictionaries and commentaries, but for many other areas as well. 
Anthropologists, for example, working in Central Africa or Latin America have their own 
questions in mind, and their own conclusions may soon become dated. But they also 
create ethnographic descriptions of societies that are rapidly changing. Their published 
ethnographies may be our best (even our only) records of those societies, and these must 
become a permanent part of our information infrastructure. We are constantly adding to our 
basic record of the world, and this record must be maintained for an indefinite future. 

The author who creates information and the system that stores that information are 
only two aspects to a larger whole. Consider, for a moment, one other critical group that 
must also embrace the idea of hypermedia and for whom longevity is even more important. 
The librarian must be able to leave information "on the shelf" for centuries rather than 
decades. No document will last long if it is not preserved as a regular part of our research 
library system. I would like to emphasize that a standard that does not meet the most 
stringent needs of research librarians is, at best, a crude stopgap and, at worst, quicksand 
that will trap and overwhelm the unwary, and that will make subsequent travellers view 
hypermedia with distrust. 

The problem from our perspective may be summarized as follows. Hypermedia 
systems offer tremendous potential and may ultimately revolutionize the way in which 
research is performed and disseminated. Hypermedia cannot, however, have the impact 
that it warrants until we can provide diachronic continuity. A database that runs on ten 
systems now (and thus provides synchronic continuity) and zero systems a decade from 
now does scholar and librarian little good. 

Problem 1: Exchanging Data 

Exchange standards offer one obvious approach to the problem of diachronic 
continuity. If we can exchange database Fred between N different systems at any one 
given time, then there is a high probability that Fred will be able to move into new systems 
that have not yet appeared. Fred may not take advantage of all the capabilities of its new 
environment just as a black and white silent movie does not exploit the full capabilities of 
the television on which it may be viewed, and in some ways performance in the new 
system may be weaker (e.g., video has inherently less resolution than any film and thus 
cannot reproduce all the information in any one frame of the film). But at least Fred, like 
the silent movie, will still be accessible. 

Converting hypermedia databases from one system to another is much more complex 
than transferring silent film to video, more complex, perhaps, than the problem of 
converting a play into a movie. For while the play and the movie have profoundly different 
options open to them, the script of the play (in most cases) provides a common linear path 
which both can share, and a movie can imitate the conventions of the stage. 

The conversion from one hypertext system to another may well prove more analogous 
to the problem of machine translation. Existing hypermedia databases and even standards 
for particular types of information (such as the SGML standard for text) are generally 
closer to syntax than semantics. They illustrate how various objects are put together, but 
they can only incorporate a limited amount of information about why the objects are put 
together in that particular way. The designers of the hypermedia database will 
unconsciously tend to rely on the peculiarities of the system that they are using. Authors 
organize their data differently when using a system in which scrolling windows can contain 
large documents (e.g. Intermedia, NoteCards) than when working with an inherently 
"chunky" hypertext system (one built around many small cards). 

Consider two examples: 

1) HyperCard can easily store a hierarchical map. The user begins with a view of the 
world, zooms into a view of a particular country, and then calls up the plan of a particular 
city. A user can implement such a map easily with buttons containing goto's, but will an 
interchange program be able to recognize that these buttons represent, in fact, a logical 
hierarchy? If the interchange program cannot make such inferences, will it produce results 
like the machine translation system that interprets "time flies like an arrow" as "time-flies 
enjoy arrows" or as "time the flies (i.e. with a stopwatch)"? If hierarchical structures of one 
kind or another are to be a building block for hypermedia systems, then all such systems 
must contain primitives that recognize these structures. 
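
The difficulty in this first example can be made concrete with a minimal sketch in Python. It assumes a much simplified, hypothetical card model: the names `Card`, `buttons`, and `parent` are illustrative only and do not correspond to any real HyperCard interface. When cards carry nothing but goto buttons, a "back" button looks exactly like a "zoom in" button, so a naive interchange program misreads the hierarchy; when the system offers an explicit parent primitive, the hierarchy can simply be read off.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Card:
    """Hypothetical card: a name, goto targets, and an optional explicit parent."""
    name: str
    buttons: List[str] = field(default_factory=list)  # names of goto targets
    parent: Optional[str] = None                       # explicit hierarchy primitive


def children_from_gotos(stack: Dict[str, Card]) -> Dict[str, List[str]]:
    """Naive inference: treat every goto button as a parent-to-child link."""
    return {name: list(card.buttons) for name, card in stack.items()}


def children_from_primitive(stack: Dict[str, Card]) -> Dict[str, List[str]]:
    """Read the hierarchy from the explicit parent field only."""
    tree: Dict[str, List[str]] = {name: [] for name in stack}
    for card in stack.values():
        if card.parent is not None:
            tree[card.parent].append(card.name)
    return tree


# A three-level map. Each lower card also has "back" buttons to higher levels.
stack = {
    "World": Card("World", buttons=["Greece"]),
    "Greece": Card("Greece", buttons=["Athens", "World"], parent="World"),
    "Athens": Card("Athens", buttons=["Greece", "World"], parent="Greece"),
}

guessed = children_from_gotos(stack)       # back buttons are misread as children
declared = children_from_primitive(stack)  # the intended world > country > city tree
```

In this toy stack the naive reading makes "World" a child of "Athens", while the explicit primitive yields the intended three-level tree. A real interchange program would face the same ambiguity with far less regular data.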

2) Much discussion has gone into the creation of links between anchors in various 
documents. Document X would have a link to an anchor in Document Y, and the anchor 
would identify a particular point or selection in Document Y. This is a critical and generic 
concept, but, in some contexts, it replicates a function that text strings implicitly perform: 
e.g. "Shakespeare Macbeth 1.7.1-2 'If it were done quickly'" defines a precise 
subset of the text. The text string is a high level construct that does not depend upon 
anchors into one particular document: it will work equally well whether the Riverside 
Shakespeare or the Folger edition of Macbeth is online. Does an automatic linking protocol 
really constitute an advance over such a reference, or even over a standard journal reference 
(e.g. "HSCP 91 (1987) 175 note 60")? If a document (or an object in a museum for that 
matter) does not already have an anchor of this kind, then that information has not been 
published in any meaningful sense. Publication presupposes the existence of canonical 
citation schemes. Where canonical citation schemes do not exist or are imperfect, then 
information, like a misshelved book, is lost. 
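
As a rough illustration of how a citation string can act as an edition-independent anchor, the following Python sketch parses a reference of the form "work act.scene.line-line" and resolves it against whichever edition happens to be online. The function names, the citation pattern, and the two "editions" are assumptions made for the example: the editions are abbreviated toy data, not transcriptions of the Riverside or Folger texts, and the sketch assumes that both editions share the same line numbering.

```python
import re
from typing import Dict, List, Tuple

Locus = Tuple[int, int, int]          # (act, scene, line)
Edition = Dict[str, Dict[Locus, str]]  # work title -> locus -> text of that line

# Assumed citation shape: "<work> <act>.<scene>.<first line>[-<last line>]"
CITATION = re.compile(
    r"^(?P<work>.+?)\s+(?P<act>\d+)\.(?P<scene>\d+)\.(?P<first>\d+)(?:-(?P<last>\d+))?$"
)


def parse_citation(citation: str) -> Tuple[str, List[Locus]]:
    """Split a canonical citation into a work title and the loci it covers."""
    m = CITATION.match(citation.strip())
    if m is None:
        raise ValueError(f"not a canonical citation: {citation!r}")
    act, scene, first = int(m["act"]), int(m["scene"]), int(m["first"])
    last = int(m["last"]) if m["last"] else first
    return m["work"], [(act, scene, n) for n in range(first, last + 1)]


def resolve(citation: str, edition: Edition) -> List[str]:
    """Look the cited lines up in one particular edition."""
    work, loci = parse_citation(citation)
    return [edition[work][locus] for locus in loci]


# Toy stand-ins for two editions (hypothetical, abbreviated data).
EDITION_A: Edition = {"Shakespeare Macbeth": {
    (1, 7, 1): "If it were done when 'tis done, then 'twere well",
    (1, 7, 2): "It were done quickly",
}}
EDITION_B: Edition = {"Shakespeare Macbeth": {
    (1, 7, 1): "If it were done, when 'tis done, then 'twere well",
    (1, 7, 2): "It were done quickly",
}}
```

The same string resolves in either edition without any anchor having been planted in the target document, which is the property the argument above relies on. Where editions number their lines differently, a concordance between numbering schemes would be needed, and the sketch does not attempt one.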

Second, publication (as in Augment) cannot be retracted. A statement, once it has 
been placed in the public domain can never be changed: it can be commented on, and its 
author may recant, but the statement must remain a part of the record. A publication system 
(as opposed to an authoring system) should not accept vanishing links. 

New products such as SuperCard and Plus do attempt to interpret all the information 
within a HyperCard stack, but only because their own model of the world is a superset of 
the HyperCard model. Once a document is truly converted to either SuperCard or Plus 
(i.e., once it takes advantage of elements in the SuperCard or Plus model that are not 
available in HyperCard), then it cannot easily move back to HyperCard or even laterally 
from SuperCard to Plus or vice versa. As soon as hypermedia systems begin to change 
their view of the world, then different systems will have different abilities. Translating 
from one environment to another becomes an interpretive act, in which human intelligence 
may prove irreplaceable for the foreseeable future. 

The rest of this paper will cover problems that we in the Perseus Project have 
encountered in building a hypermedia database on ancient Greek civilization. The domain 
is relatively compact: between 40 and 100 megabytes of source texts in original Greek and English 
translation, a dictionary, a small encyclopedia, essays, maps, plans, and 5,000 to 10,000 
images of Greek sites, monuments, and art objects will provide a solid foundation for the 
study of this subject. Nevertheless, the problems inherent in managing a 
heterogeneous database of this magnitude are substantial. 

More importantly, this data is intended to serve a wide audience. First, it aims at 
different levels of expertise: the undergraduate in a general course and the professor doing 
research. Second, it aims at various kinds of expertise: the same data should be useable 
for the study of literature, art, history, linguistics and other subjects. In fact, both 
distinctions are related: the more accessible information about art is, for example, to the 
freshman, the easier it can be for literary critics, who do not now have easy access to that 
information, to use it in their work. 

Our work is, to a large extent, an experiment within which we are trying to identify the 
basic data structures with which people work. Objects such as dictionaries, atlases and 
museum catalogue entries have evolved certain fairly stable forms that are based on 
functions that people seek to perform. As these tools migrate into an electronic 
environment they can perform new functions and their forms will inevitably change. Until 
we have a better idea of what these new functions will be, however, we are not in a good 
position to build environments in which the form of information can evolve. 

Data Models and Approaches: Some Concrete Problems 

Every discipline probably has its own proprietary data models which every expert 
must internalize. Thus, the mathematician must know how to create and present a logical 
proof, while the chemist needs to provide certain kinds of information when describing an 
experiment. The student of ancient Greek literature knows how to read and to use a 
scholarly edition of a Greek text, while the archaeologist knows how to work with objects 
discovered on a dig. Hypermedia standards must provide a model in which each group can 
express as many significant features as possible. They must at least replicate the 
functionality of printed texts, but should also allow people to perform new operations. 

Defining a data structure is not an easy task. Even if we have a model that satisfies 
one group, another group may want to use the same information in different ways. The 
following section provides two general examples of the iterative process that we have had 
to undergo. The examples are fairly specific but they illustrate how difficult it will be to 
define what some people have in mind when they think about such basic categories as 
archaeological objects and source texts. The problems below are very specific, and domain 
experts in various fields will have to create the actual specifications for these data 
structures. Nevertheless, the standards that evolve for hypermedia databases will 
determine how feasible it is for the domain experts to organize their information. The more 
effectively authors can organize their data, the more useful the underlying standards will 
prove. Particular and domain specific as these problems may seem, they address 
fundamental data types. Until hypermedia standards provide a platform that supports such 
data types, hypermedia cannot play a major role in the publication or the long term 
archiving of information. 

The classicist discussing Greek religion may, for example, use the painting on a Greek 
vase as evidence. He may point out that a man is leading a bull to an altar, that the 
man holds in his hand a sacrificial cake and some barley to sprinkle over the victim. He 
may draw attention to the kind of knife held or some other particular of the scene. In this 
context, a single one bit deep bitmap may well contain all the information necessary, and 
the expert in Greek religion might want to collect a large number of such images. 

The art historian might want to study the style of the painter who created the picture. 
He would need to study very subtle details (such as the way in which anatomical details 
such as eyes or knees were rendered), but such detail will almost certainly be lacking in the 
bitmap. The classicist can build up an enormous database of images which then prove to 
be of little use to his or her colleagues in archaeology or art history. 

Worse, the art historian may actually conclude that one bit deep images are all that the 
computer can offer and thus turn away from the new medium. Likewise, many videodiscs 
(to choose one technology) simply imitate image libraries, even though a single video 
image cannot approach the clarity of a 35 mm slide. The art historian may thus conclude 
that a videodisc is just a poor substitute for a slide archive, but if the videodisc designer 
takes advantage of the storage space, then he or she can store multiple views of each 
complex slide and can provide much more information. A videodisc that stores details of 
every head in a series of paintings contains information that the slides do not, for the ability 
to move directly from head to head to head allows the reader to see the images in a different 
way than would the undifferentiated slides. In the case of images, the media available to us 
so far have been so primitive that few of the scholars who really care about art, for 
example, have been able to see much promise in electronic databases at all. 

Suppose, then, one builds up a database that serves the needs of both the classicist and 
the art historian. Thus, when we in the Perseus Project, for example, commission new 
photography of an art object, we collect multiple views: dozens for a single vase with many 
figures. A videodisc thus will have enough color views so that it will allow scholars to see 
more detail of the objects on the disc than could any affordable printed publication. 

The case is not, however, closed. Up come the anthropologists, also expert in 
handling physical remains. For them, the detailed views are extremely useful, but they 
want to reconstruct day to day life of the period. The database of images focuses primarily 
on the most elegantly painted and attractive vases: the art historian wants to study the 
aesthetics of classical Greece; since carefully drawn and visually harmonious vases contain 
much of the information that the general classicist needs, the two groups work well 
together. The anthropologist wants to see what people actually used, not just the most 
polished specimens, but the coarse, hurriedly drawn pieces as well. Perhaps he does not 
even want vases in particular, but tools and other objects that illustrate the kind of work that 
people performed. Again, the individual entries for each object may be quite attractive, but 
the anthropologist might argue that the collection as a whole provides a biased picture of the 
ancient world. Nor are the anthropologist's complaints necessarily limited to gross 
selection of objects: he or she has very different kinds of questions to 
ask, and if a database is going to serve those interests, then its structure will undoubtedly 
need to be changed. 

Literary texts offer similar problems, for different groups view texts in different ways. 
The text of Moby Dick, for example, is conceived of as a fairly stable text stream. The 
critic will refer to a particular chapter or perhaps a page in a particular edition, but what 
Melville wrote is clear enough. It is relatively easy to build a publication model for "text" if 
we think in terms of nineteenth century English and American novels (and if we do not 
think too deeply about the problem). 

If we apply this concept to a text that was transmitted in manuscript, this model is 
inadequate. Every time a large document is copied by hand, mistakes appear, and these 
mistakes become compounded with each new copy. Over the course of centuries, many 
variant forms of the text evolve and only with the printing press can this process of 
dissolution be arrested. Nevertheless, the damage is done: editors must choose between 
many competing variants, and must tell the reader when they choose a reading from 
manuscript X or Y. The reader needs, at a minimum, to see what variants are available for 
any passage of text. Ideally, the system should be able to show the reader where editor A 
chooses different readings from editor B, or to show, for example, which corrections in the 
text were suggested before 1800. 
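
The two queries named in this paragraph suggest what a minimal variant-aware text model has to record. The Python sketch below is one possible shape, not a description of the Perseus data structures: the class names, fields, sigla, and readings are all invented for illustration. Each passage keeps every attested or conjectured reading together with its source, and each editor's text is just a choice among those readings.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Reading:
    text: str
    source: str                 # manuscript siglum, or the scholar who proposed it
    year: Optional[int] = None  # year a correction was proposed; None for manuscript readings


@dataclass
class Passage:
    locus: str                  # canonical reference for the passage
    readings: List[Reading]     # every variant known for this passage
    choices: Dict[str, int]     # editor name -> index of the reading that editor prints


def disagreements(passages: List[Passage], editor_a: str, editor_b: str) -> List[str]:
    """Loci where the two editors print different readings."""
    return [p.locus for p in passages if p.choices[editor_a] != p.choices[editor_b]]


def corrections_before(passages: List[Passage], year: int) -> List[Tuple[str, Reading]]:
    """Scholarly corrections that were proposed before the given year."""
    return [
        (p.locus, r)
        for p in passages
        for r in p.readings
        if r.year is not None and r.year < year
    ]


# Invented sample data: three passages, two manuscripts, two later scholars.
passages = [
    Passage("1.1", [Reading("alpha", "MS X"), Reading("alpa", "MS Y")],
            {"Editor A": 0, "Editor B": 0}),
    Passage("1.2", [Reading("beta", "MS X"), Reading("zeta", "MS Y"),
                    Reading("theta", "Scholar Z", 1750)],
            {"Editor A": 0, "Editor B": 2}),
    Passage("1.3", [Reading("gamma", "MS X"), Reading("gemma", "Scholar W", 1890)],
            {"Editor A": 1, "Editor B": 1}),
]
```

With this sample, `disagreements(passages, "Editor A", "Editor B")` reports only passage 1.2, and `corrections_before(passages, 1800)` finds the single 1750 conjecture. A flat text stream of the kind that suffices for a nineteenth century novel cannot answer either question.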
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Figure 1: Simplified view of a scholarly edition derived from 
various "manuscripts". Every line of text may involve an 
"editorial selection." 

Again, addressing both the nineteenth century novel and ancient Greek literature forces 
us to broaden our model of what a text is. Nevertheless, we are not finished. Consider a 
popular text that appears in various forms over a number of centuries. In the case of the 
Greek poet Aeschylus, for example, we assume that there is an original source text (i.e., 
what Aeschylus actually wrote) that we are trying to reconstruct. Ideally, we could treat 
Aeschylus like Melville if we had an authoritative edition of Aeschylus. In the case of a 
popular story, we may have multiple versions, none of which is associated with any 
dominant owner and each of which is essentially just as important as the others. Each 
version of the story may itself have its own manuscript tradition, but now we must 
consider a kind of compound versioning: a story consisting of multiple versions each of 
which has numerous textual variants. 

Figure 2: A compound text, consisting of n scholarly texts (each 
of which may be constructed from a variety of manuscripts). 

On the other end, even the category of "manuscript" is not completely simple. A 
document may be preserved on a stone or clay tablet. The writing system used to store this 
text may be crude, and scholars may need to provide normalized transliterations that follow 
conventional spelling rules or add some standard kind of information (thus many editors of 
Greek inscriptions add accents to their final editions). In such cases, an edition may 
include (1) a picture of the inscription, (2) a transliteration of the inscription without accents 
or word breaks, and (3) a regularized form. The physical medium may be stone or 
(as in the case of much Akkadian and Sumerian material) clay tablet, but in many ways the 
problem is similar to that faced by someone transcribing a sound recording made by the 
speaker of a little known language. The ethnographer may well want to include a narrow 
phonemic transliteration. Thus, we might outline the structure of a source document (of 
which a "manuscript" is one example) as: 

Figure 3: Diagram for one taxonomy of source documents (such 
as a manuscript or inscription). 



This diagram presents a basic data model that will solve many of the problems for 
storing nineteenth century novels, Greek plays, Akkadian myths, Greek and Akkadian 
inscriptions, and an anthropologist's verbal recordings made in the field. 
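
One way to read the model described in the prose above is as three nested record types: a source document with several parallel representations, a scholarly text built from such sources, and a compound text that gathers several scholarly texts. The Python sketch below follows only what the text states; every class and field name is an assumption made for illustration, and the sample strings are placeholders rather than real inscriptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SourceDocument:
    """A manuscript, inscription, tablet, or field recording."""
    identifier: str
    medium: str                            # e.g. "stone", "clay tablet", "sound recording"
    facsimile: Optional[str] = None        # reference to a picture or recording of the original
    transliteration: Optional[str] = None  # text as found: no accents or word breaks added
    regularized: Optional[str] = None      # normalized form following conventional spelling


@dataclass
class ScholarlyText:
    """One edition, constructed from any number of source documents."""
    editor: str
    sources: List[SourceDocument] = field(default_factory=list)


@dataclass
class CompoundText:
    """A story that survives in several versions, each with its own edition."""
    title: str
    versions: List[ScholarlyText] = field(default_factory=list)


def reading_text(doc: SourceDocument) -> Optional[str]:
    """Prefer the regularized form; fall back to the raw transliteration."""
    return doc.regularized or doc.transliteration


# Placeholder data only.
inscription = SourceDocument("inscr-1", "stone", facsimile="photo-001",
                             transliteration="ABCDEF")
recording = SourceDocument("tape-7", "sound recording", facsimile="tape-7.audio",
                           transliteration="a b c", regularized="A b c.")
story = CompoundText("Sample story", [ScholarlyText("Editor A", [inscription]),
                                      ScholarlyText("Editor B", [recording])])
```

The point of the sketch is that a printed novel fills in only the regularized text, while an inscription or a field recording needs the other slots as well. A model designed around the novel alone would have had nowhere to put them.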

The particulars of this simplified model are less important than the process that led to 
its creation: had we standardized around the nineteenth century novel, the Greek play or 
the inscription, we would have adopted an impoverished data model. We need to view in 
as much detail as possible as many different kinds of text as we can before we assume that 
we know what a text is or what it can do. A system that can handle these functions must 
address links not simply from one document to another, but between text, pictures, sound 
and motion video. Until we have systems that actually perform these tasks, we will not be 
sure that our standards actually account for the problems that people need to solve. This 
kind of analysis has barely begun, and we have a long way to go before we reach any 
consensus as to how any basic categories of information should be organized. 

Hybrid Data Models 

So far we have talked about simple data types that have analogues in the world of 
print. We can insulate the individual components of data from the vagaries of any one 
system by storing information in the most powerful medium possible. Thus, we at Perseus 
have pragmatically chosen to expend extra effort so that our information will be useful for a 
longer period of time: drawings are stored not as bitmaps but in PostScript; for still images 
we use 35 mm film rather than video. A single PostScript file can generate multiple bitmaps at 
varying resolutions, and whatever the future of PostScript itself, subsequent graphic 
formats will probably be able to absorb most of the existing PostScript data. We will thus 
be able to upgrade our site plans and drawings to systems that do not rely on bitmaps. 
Slides, though not electronic, contain far more information than we can now reasonably 
store in digital form. Should new formats such as HDTV actually arrive within the next 
five to ten years, film will convert much more elegantly than inherently crude NTSC video 
signals with their limited resolution. None of the hypermedia or hypertext systems 
currently available can recognize the sophisticated text structures that one can create in a format 
such as SGML, but we store our texts in SGML and will be able to take advantage of more 
powerful hypertext systems as these emerge. 

Efforts are already underway to provide workable standards in at least some of these 
individual areas. The Text Encoding Initiative, funded primarily by the NEH and EEC,^1 is 
a widely supported effort to build basic document formats for humanists within the 
framework of SGML. Storing images as slides or as PostScript drawings is a pragmatic 
hedge rather than a workable standard. 

Work on texts or images in isolation is only part of the problem, for these are only 
some of the basic components out of which a hypermedia document might be constructed. 
Once we know how to handle these individual pieces, a hypermedia system must then be 
able to make the individual pieces work together as a whole. If an historical source text, an 
atlas and a database of topographical images (i.e., pictures showing buildings and places) 
all exist in the same database, then it can become much easier for the person going through 
the historical document to locate places on a map and even to call up images of what that 
place looks like now. Someone, for example, reading in the Greek historian Herodotus 
about how the Greeks defeated the Persians in the battle of Salamis might thus call up a 
map on which Salamis appears, then view color images of the strait in which the battle was 
fought or the hilltop from which Xerxes, the Persian emperor, viewed the battle. 

Once traditionally discrete bodies of knowledge such as text, atlas and image archive 
can dynamically interact with one another, then new compound document types become 
feasible. A narrative on the battle of Salamis might consist of (1) links to the relevant text 
sources, (2) a map of Salamis with various buttons which were in turn (3) links into the 
image archive showing what the strait of Salamis or the hilltop of Xerxes looks like. Nor 
should such links be entirely passive: an animated version of the battle could be overlaid 
onto the generic map. Rather than calling up an entire picture, the system should be able to 
crop a particular detail, so that the view frames that particular hill, for example, on which 
Xerxes may have sat. A document may dynamically abstract and shape data from a larger 
data base. 

Such interactive and dynamic links fulfill logical needs and will inevitably become part 
of the author's repertoire. An author should be able to create a document that pulls together 
and performs operations on material in a larger database. It is not enough, however, to be 
able to perform such actions in a particular system in a particular time. Once an author has 
published such a hypermedia document (perhaps as part of a book interpreting the wars 
between the Greeks and Persians), then scholars a century later must be able to view that 
hypermedia document and see exactly what the author saw. If this diachronic continuity is 
not feasible, then the hypermedia document may have been distributed but cannot properly 
be said to have been "published". True publication implies that the material will remain 
available for the indefinite future. 

^1 The Project Director for this is Dr. C. Michael Sperberg-McQueen, of the University of Illinois at 
Chicago Circle. 

Conclusions 

We should move as quickly as we can towards some kind of synchronic interchange 
standard for hypermedia. We need to learn how well we can move fairly complex sets of 
data and functionality between diverse systems (e.g. HyperCard, Intermedia, NoteCards). 
Once we are able to perform this task for some data, we may well decide that the 
interchange format that developed is, in fact, too inflexible. With luck, this interchange 
format will be a powerful platform that can evolve into a standard that will provide scholars 
and archivists with the diachronic continuity that they require. We must, however, be 
prepared to discard that format. 

The risk is probably greatest for those of us creating databases: until we have 
diachronic standards, the information that we create may be available in libraries, but it will 
not be part of the library system. It will be distributed, but not truly "published." 
Nevertheless, we cannot make much progress on standards without applying them to 
substantial and fairly complex bodies of data. 

From a practical point of view, we suggest that those developing interchange standards 
should plan to work from the beginning with one or more databases at least as large and 
complex as that of the Perseus Project. An interchange system that can move this database 
back and forth between three or more different hypermedia systems may not be perfect, but 
an interchange system that cannot satisfy this practical requirement will certainly not 
support the much greater challenges that it will face. 
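
The practical requirement proposed here amounts to a round-trip test, and a small Python sketch shows the kind of check it implies. Everything in it is a deliberately crude assumption: a database is reduced to a set of typed links, a "system" is modelled as an importer that downgrades any link type it does not understand to a plain goto, and the system and link-type names are invented. No real hypermedia product is being described.

```python
from typing import Callable, FrozenSet, List, Tuple

Link = Tuple[str, str, str]   # (source node, target node, link type)
Database = FrozenSet[Link]
System = Callable[[Database], Database]


def make_system(supported_types: FrozenSet[str]) -> System:
    """Model a system as an importer: supported link types survive,
    anything else is downgraded to an untyped 'goto'."""
    def importer(db: Database) -> Database:
        return frozenset(
            (src, dst, kind if kind in supported_types else "goto")
            for src, dst, kind in db
        )
    return importer


def round_trip(db: Database, route: List[System]) -> Database:
    """Pass the database through each system on the route in turn."""
    for system in route:
        db = system(db)
    return db


def lost(original: Database, after: Database) -> List[Link]:
    """Links present in the original that did not survive the trip intact."""
    return sorted(original - after)


# Two hypothetical rich systems and one card-based system that knows only gotos.
rich_a = make_system(frozenset({"goto", "child", "citation"}))
rich_b = make_system(frozenset({"goto", "child", "citation"}))
cards = make_system(frozenset({"goto"}))

db: Database = frozenset({
    ("World", "Greece", "child"),
    ("Text", "Map", "citation"),
    ("Narrative", "Map", "goto"),
})

clean = round_trip(db, [rich_a, rich_b, rich_a])
degraded = round_trip(db, [rich_a, cards, rich_a])
```

In this toy setting the route through two equally rich systems returns the database unchanged, while a single pass through the card-based system permanently flattens the typed links, even though the final system could have represented them. An interchange format would be tested in the same spirit, only against a database of realistic size and complexity.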
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Abstract 

We describe a hypertext "meta-model"— one that provides an organization for the architec- 
ture of a hypertext model. The specific meta-model presented was developed in the context 
of the Trellis hypertext model. However the organization seems generally applicable to other 
models as well. As such the meta-model may be a good candidate for a hypertext reference 
model, and so we call it the Trellis hypertext reference model. In this report we first describe the 
Trellis hypertext reference model, and then discuss the relationship of some hypertext-defined 
concepts to the reference model. 

1 Introduction 

As a side-product of our work developing the Trellis model of hypertext [SF89a], we have defined 
a "meta-model" that provides an organization for the architecture of the hypertext model. It is 
the purpose of this report to describe this meta-model within the context of the Trellis model and 
further to suggest that it is applicable to other models of hypertext as well. As such it may serve 
as an appropriate framework for the development of a general hypertext reference model. In this 
report we shall call the "meta-model" the Trellis hypertext reference model, abbreviated as r-model, 
as a reflection of this application. The model of hypertext itself will be called the hypertext model, 
or more simply the model throughout the report. 

The Trellis hypertext reference model is based around a collection of representations of the 
hypertext at different levels of abstraction. Abstractions range from the hypertext as a collection 
of abstractly-defined independent components through more concrete representations in which the 
characteristics of the hypertext's physical display have been established, to the view of the hypertext 
that is projected on a physical display device for the benefit of the person reading the hypertext. 
The representations at a particular level of abstraction depend upon representations at a greater 
level of abstraction, and these dependencies are shown within the r-model. 

A description of the r-model follows in the next section. Section 3 discusses how selected 
components of existing hypertext systems and models fit into (or are omitted from) the r-model.

*Supported in part by a grant from the National Science Foundation, CCR-8810312.

Figure 1: The Trellis Hypertext Reference Model (the r-model)

2 The r-model

The r-model, shown symbolically in Figure 1, is separated into five logical levels. Within each
level is found one or more representations of part or of all of the hypertext. Speaking quite
broadly, the levels may be grouped into three overall categories: abstract, concrete, and visible.^1
The abstract component and abstract hypertext levels define an abstract representation of the
pieces of the hypertext and of the hypertext itself. These abstractions are transformed into more
concrete representations of the hypertext in the concrete context and concrete hypertext levels,
representing first the presentation of the hypertext's content and then the mapping of that content
into the displayed windows. The resulting concrete windows are then viewed, producing one or
more displays on one or more physical display devices. In summary, the representations in the
abstract component level are at the greatest level of abstraction and those in the visible hypertext
level are at the lowest level.

Each representation is shown in the figure as a box. A representation is itself an abstract
concept: a consistent presentation of the hypertext elements of interest. Representations in the
r-model may depend on the representations at a greater level of abstraction. Such a dependency is
shown in the figure as an arc between the representations. Because a representation's dependencies
are on those representations at a greater level of abstraction, and not on those at the same or lower
levels of abstraction, the abstract and concrete levels in the diagram are further subdivided. It is
worth emphasizing that a representation may not actually correspond to a separately-identifiable
"physical" representation of the hypertext; for example, the representation may be expressed as a
mapping between elements of more abstract representations. 
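
The dependency rule just stated can be made concrete with a small check. The following Python sketch is an illustration only: the level and representation names come from this report, but the list of arcs is an assumption inferred from the prose (it is not a transcription of Figure 1), and the names LEVELS, LEVEL_OF, ARCS, and violations are hypothetical.

```python
# Levels of the r-model, most abstract first (names taken from the text).
LEVELS = [
    "abstract component",
    "abstract hypertext",
    "concrete context",
    "concrete hypertext",
    "visible hypertext",
]

# Representation -> level, following the names used in this report.
LEVEL_OF = {
    "structure": "abstract component",
    "abstract content": "abstract component",
    "abstract buttons": "abstract component",
    "abstract containers": "abstract component",
    "content-structure associations": "abstract hypertext",
    "button-structure associations": "abstract hypertext",
    "container-structure associations": "abstract hypertext",
    "concrete content": "concrete context",
    "concrete windows": "concrete hypertext",
    "visible HT segment": "visible hypertext",
}

# ASSUMED arcs (dependent, depended-upon), inferred from the prose only.
ARCS = [
    ("content-structure associations", "structure"),
    ("content-structure associations", "abstract content"),
    ("button-structure associations", "structure"),
    ("button-structure associations", "abstract buttons"),
    ("container-structure associations", "structure"),
    ("container-structure associations", "abstract containers"),
    ("concrete content", "content-structure associations"),
    ("concrete content", "button-structure associations"),
    ("concrete windows", "concrete content"),
    ("concrete windows", "container-structure associations"),
    ("visible HT segment", "concrete windows"),
]


def violations(arcs):
    """Return the arcs whose target is not at a strictly greater level of abstraction."""
    rank = {level: i for i, level in enumerate(LEVELS)}
    return [
        (dependent, dependency)
        for dependent, dependency in arcs
        if rank[LEVEL_OF[dependency]] >= rank[LEVEL_OF[dependent]]
    ]
```

Under these assumptions, an arc between two representations of the same level (for example, from the structure to the abstract content) is reported as a violation, which is why the abstract and concrete categories must each be split into two levels.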

We will now focus in turn on each of the levels of the r-model. In the following sections, we will
describe the level, its representations, and discuss the dependencies on representations at higher
levels.

2.1 Abstract hypertext 

An abstract hypertext description specifies a hypertext and its components, but does not describe 
the details of how the hypertext is to be presented to its reader.

2.1.1 Abstract component level

The organization of the three highest levels reflects a separation of the hypertext into structure, 
content, and context. The structure represents the elements of the hypertext and their relationships. 
The specific content of the hypertext as presented to the system's user reflects the context within
the structure in which the content appears; in other words, the display of the content is modified
to reflect its context.

The representations within the abstract component level present the components that will be
associated with one another to form the hypertext. Within the context of this level, the
representations are independent of each other; such associations will be made at lower levels of abstraction.
Our abstract view of a hypertext separates out the hypertext's structure from the elements that
many users perceive as composing the hypertext. In other words, the structure, perhaps a directed
graph, is separated from the collection of contents that are to be displayed to the reader and the
collection of "buttons" that will be selected by the reader when moving from location to location
in the hypertext. Additionally, it may be the case that the view of the hypertext presented to the
reader combines together independent content elements into an integrated whole. The presence (or
absence) of such composition is also represented abstractly at this level. We will now consider each
of the representations in turn.

^1 The choice of these levels of representation parallels and expands Shaw's model of printed documents [Sha80],
which identifies abstract, concrete, and viewing mappings for the document.

One natural representation for the structure of the hypertext is as a network. In our own 
work, we use a Petri net structure, which provides automaton semantics as well as the network rep- 
resentation. However other graph-based structures are appropriate as well— for example automata 
such as deterministic finite automata or data structures such as directed graphs, trees, or lattices. 
The structure of the hypertext need not be limited to networks; indeed, it may be desirable to use 
representations that are not graph-based in form; for example constraint-based descriptions. Note 
that even in graph-based representations, there is no requirement that the elements of the structure 
be fully-connected. The necessary characteristic of the structure representation is that it provides
the "placeholders" that will be associated with the hypertext's content and that it describes the 
relationships that exist among these placeholders. 
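
To illustrate how a Petri net can supply both a network and automaton semantics, the following Python sketch models a very small net. It is a simplified illustration and not the Trellis implementation: it assumes at most one token per place, treats places as the placeholders for content, and uses hypothetical names (PetriNet, make_net, and the place and transition names).

```python
from dataclasses import dataclass, field


@dataclass
class PetriNet:
    # transition name -> (input places, output places)
    transitions: dict
    # places currently holding a token (assumption: at most one token per place)
    marking: set = field(default_factory=set)

    def enabled(self):
        """Transitions whose input places all hold a token."""
        return sorted(
            name
            for name, (inputs, _outputs) in self.transitions.items()
            if set(inputs) <= self.marking
        )

    def fire(self, name):
        """Consume the tokens on a transition's inputs and mark its outputs."""
        inputs, outputs = self.transitions[name]
        if not set(inputs) <= self.marking:
            raise ValueError(f"transition {name!r} is not enabled")
        self.marking = (self.marking - set(inputs)) | set(outputs)


def make_net():
    """A hypothetical three-place net: an introduction leading to text plus figure."""
    return PetriNet(
        transitions={
            "start": (["intro"], ["text", "figure"]),
            "back": (["text", "figure"], ["intro"]),
        },
        marking={"intro"},
    )
```

In this sketch the marking identifies which placeholders are currently active, and firing "start" activates two placeholders at once, something a plain directed graph with a single current node does not express directly.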

The abstract content is arbitrary in form. It may, for example, include textual, graphical, 
animated, or perhaps even audio and video material. The content may be specified directly or
may be the result of a computation. While it does not contain links, it may incorporate markers
that define a collection of potential locations for the mappings of links and their presentations that
occur in lower levels of the r-model. The content may be described in a form that is independent
of the eventual characteristics of its display, or indeed it may be described in a form that is highly
dependent on the eventual display. Because of the flexibility of the mapping from content to
structure in the next level, however, a display-independent representation seems most appropriate. 

The structure representation identifies the relationships among content elements but does not 
indicate how those relationships will be shown for selection by the hypertext's reader. The abstract 
buttons are abstractions of the ways in which the relationship can be displayed. Abstract buttons 
may themselves have content and an associated type. The content is provided to specify what will
be shown when the button is displayed. The type is needed to specify how the button will be
displayed and other characteristics of its behavior on display and selection. As with the content of
the abstract content, the content of the abstract button is flexible in form; in implementation it
actually may be computed or it may be statically defined.

The final component in this level, the abstract containers, differs from the others in that it 
is an abstraction of how the pieces of the hypertext will be combined when shown to the reader
(how it will be aggregated and combined for display), and not of what is in the hypertext. For
example, if several content elements are displayable, one possible presentation would be to show
each element separately while another would be to combine the separate elements into a composite,
which would be presented to the reader as a unit. In the first case, one could say that a separate 
container had been associated with each separate content element, while in the second case, one 
container would hold all content elements. Such characteristics are abstracted at this level by the 
abstract containers. 
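
The two cases in the preceding paragraph can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name compose, the element names, and the container names are all hypothetical, and a container is reduced to a bare label.

```python
def compose(elements, container_of):
    """Group displayable content elements by the abstract container assigned to each.

    elements:     content element names, in display order.
    container_of: dict mapping each element to a container name.
    Returns (container, [elements]) pairs, ordered by first appearance.
    """
    units = {}
    for element in elements:
        units.setdefault(container_of[element], []).append(element)
    return list(units.items())


elements = ["text", "figure", "caption"]

# First case: a separate container per element, so each is shown on its own.
separate = compose(elements, {e: "c-" + e for e in elements})

# Second case: one container holds every element, so the reader sees a composite.
composite = compose(elements, {e: "page" for e in elements})
```

The content elements are identical in both cases; only the container assignment differs, which is the sense in which the containers abstract how the hypertext is combined rather than what is in it.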

2.1.2 Abstract hypertext level

The elements of the abstract component level are not connected together, as will be necessary to
form a hypertext. This association is performed in the abstract hypertext level. The abstract
hypertext level does not, however, describe how these associations will be presented within the
display of the hypertext. This is left to the concrete context level.

The content-structure associations map together elements of the structure and elements of
the abstract content. In a graph-based structure, one natural association is to map the content
elements to the nodes of the graph. No restriction is expressed in the r-model on the kinds of mappings
that are permissible; for example, it may be useful to map a single content element to multiple
locations in the structure, or conversely to map multiple content elements to a single location. In
our own work, we have found the ability to map a single content element to multiple locations
to be particularly useful. We have also found it useful to completely substitute a new collection
of abstract contents and of content-structure associations while retaining the same structure, for
example for related hypertext versions, where one may perhaps be a translation of the other.

The button-structure associations map together the structure's relationships and the abstract buttons.
A natural association in a graph-based structure is to map the abstract buttons to arcs in the
graph. In our Trellis hypertext model, based on Petri nets, the mapping is between the class of
node called a transition and the abstract buttons (i.e., there is no mapping of arcs in this particular 
graph structure). Again we emphasize that there are no limitations expressed on the form of the 
mapping, although we have found a one-to-one mapping to be the most useful. 

Finally, the container-structure associations describe the association of the structure, or 
of portions of the structure, to one or more abstract containers. One use of this association is to 
permit grouping of elements of the structure, which might in turn be displayed to the reader in a
single physical window. Different kinds of composite displays would be represented as associations
with different types of abstract containers. In general, the container-structure associations allow 
the partitioning of the subsequent display of the hypertext into one or more possibly overlapping 
pieces. 
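
The associations of this level lend themselves to a short illustration. The Python sketch below assumes a graph-based structure and represents each association as a plain dictionary; the class AbstractHypertext, its field names, and the sample data are hypothetical and are not drawn from the Trellis implementation. It shows one content element mapped to two locations, and the substitution of a new set of contents over an unchanged structure.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AbstractHypertext:
    structure: tuple          # (source placeholder, relationship, target placeholder)
    contents: dict            # content id -> abstract content
    buttons: dict             # button id -> (label, type)
    content_structure: dict   # placeholder -> list of content ids
    button_structure: dict    # relationship -> button id

    def contents_at(self, placeholder):
        """Abstract contents associated with one placeholder of the structure."""
        ids = self.content_structure.get(placeholder, [])
        return [self.contents[i] for i in ids]

    def buttons_from(self, placeholder):
        """Labels of the buttons for relationships leaving a placeholder."""
        return [
            self.buttons[self.button_structure[rel]][0]
            for source, rel, _target in self.structure
            if source == placeholder
        ]


structure = (("p1", "r1", "p2"), ("p2", "r2", "p1"))

english = AbstractHypertext(
    structure=structure,
    contents={"title": "Trellis", "body": "A hypertext model."},
    buttons={"b1": ("next", "jump"), "b2": ("back", "jump")},
    # The single content element "title" is mapped to two locations.
    content_structure={"p1": ["title"], "p2": ["title", "body"]},
    button_structure={"r1": "b1", "r2": "b2"},
)

# A related version: new contents and content-structure associations, same structure.
french = replace(
    english,
    contents={"titre": "Trellis", "corps": "Un hypertexte."},
    content_structure={"p1": ["titre"], "p2": ["titre", "corps"]},
)
```

Because the structure is held separately from the contents, the second version reuses it unchanged; only the two content-related fields are replaced.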

2.2 Concrete hypertext 

Assume that a hypertext is presented to its reader or readers in one or more windows on one or 
more physical display devices.^2 A concrete hypertext description specifies what the contents of
each of these windows will look like but does not tie down how the windows are to be arranged
on the display. For example, one particular window may be shown on several separate displays.
Furthermore, the characteristics of the displays may be different; in this case the subsequent viewing
description will also indicate how the different visible effects specified by the concrete description 
are to be rendered on the displays. 

2.2.1 Concrete context level 

The previously-described levels have defined an abstract hypertext in which the content and the
buttons have been associated with the structure. However, the abstract hypertext description does
not indicate how links are to be presented in the display of the content. Such considerations of the
mapping from the hypertext's abstract representation to its physical representation are addressed
in the concrete context level.

The concrete content presents a physically-oriented description of the hypertext. This mapping
must address the following points:

[illegible] the hypertext (or portion of the hypertext) to be presented to the reader.

• How is the abstract content to be formatted to fit within the display region?

• How are the buttons to be displayed? Will the display of the button modify the display of
the content or will the buttons and content be displayed independently? For example, in
our initial Trellis prototype (αTrellis), we have provided externally represented buttons. In
our subsequent prototype (χTrellis), we have also developed means for specifying that the
button is to be represented as a highlighted string within textual context [FS89a]. Note that
button displays are not necessarily static; in some cases the display of the button depends on
computed material (which itself may depend on the structural relationships in the hypertext).
The button represents the source of a link in the hypertext.^3

• Is the target of a link associated with a content element as a whole, or is it associated with
[illegible] display of the content?

The mappings on this level do not rely directly on the structure (abstract component level) because
the relationships have been "encoded" into the representations of the abstract hypertext
level.

2.2.2 Concrete hypertext level 

The concrete context level has defined a set of concrete content elements in which a concrete 
representation of the content has been merged with concrete representations of the buttons. The
concrete hypertext level maps those concrete representations into a set of windows for display. The
mapping, which produces the concrete windows representation, also requires that link-based
interrelationships among the windows be determined. For example, the process of following a link
can result in several different display mappings: the display of the target of the link could replace
the display of the source, could be shown in addition to the source, or could modify the display of
the source, with both being shown in the same window. 
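
The three display mappings named above can be sketched as follows. This Python fragment is illustrative only: a concrete window is reduced to a list of content names, and the names OnFollow and follow are hypothetical.

```python
from enum import Enum


class OnFollow(Enum):
    REPLACE = "replace"   # the target replaces the display of the source
    ADD = "add"           # the target is shown in addition to the source
    MERGE = "merge"       # the target modifies the source; both share one window


def follow(windows, source, target, mode):
    """Return the concrete windows that result from following a link.

    windows: list of windows, each a list of concrete content names.
    source:  index of the window in which the link was selected.
    target:  name of the concrete content at the link's target.
    """
    result = [list(w) for w in windows]   # leave the caller's windows untouched
    if mode is OnFollow.REPLACE:
        result[source] = [target]
    elif mode is OnFollow.ADD:
        result.append([target])
    elif mode is OnFollow.MERGE:
        result[source].append(target)
    return result
```

Which mapping applies is a property of the link being followed; the sketch only records the effect of each mapping on the set of windows.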

When the concrete windows representation has been formed, the presentation of the hypertext
has been determined but the details of how and where the windows are to be displayed have not. For
example, [illegible] on a display; a particular window
may be shown to several readers simultaneously on separate displays. Indeed, a particular reader
[illegible] displays may have equivalent but
[illegible] particular visual effects. Such considerations are addressed in the
next level.

2.3 Displayed (visible) hypertext 

The details of the mapping from the concrete hypertext to the visible presentation of the hypertext 
for the reader are specified here.^4 However, user interface details, such as the positioning and sizing
of windows, are orthogonal to the r-model, as discussed later in this report.



^3 See also the comparison with anchors that follows in Section 3.1.2.
^4 [illegible] simplification, since the presentation is not limited to being visible. For example, it [...]

2.3.1 Visible hypertext level

An assumption in the r-model is that the underlying hypertext is to be permitted to be used in a
distributed environment. The visible hypertext level reflects this assumption. Each visible HT
segment is associated with a separate user and display. Each segment presents one or more of
the active concrete windows to its viewer. The model does not prevent the display of a particular
concrete window in more than one segment. Whether (and how) the effects of user interactions to
one display may affect what is shown on other user displays is a property of the hypertext model,
and not of the r-model. 
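
As a rough illustration of this level, the Python sketch below represents each visible HT segment as a record of a user, a display, and the concrete windows it presents. The names Segment and viewers, and the sample users and displays, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    user: str
    display: str
    windows: list = field(default_factory=list)   # names of concrete windows shown


def viewers(segments, window):
    """Users whose segment currently presents the given concrete window."""
    return [s.user for s in segments if window in s.windows]


segments = [
    Segment("reader-1", "display-A", ["w1", "w2"]),
    Segment("reader-2", "display-B", ["w2"]),
]
```

Here the window "w2" appears in two segments, which the r-model permits. The sketch deliberately says nothing about whether an interaction by one reader changes what the other sees, since that is left to the hypertext model.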

3 Issues in application of the r-model 

We now turn our attention to three aspects of the r-model, which we shall consider in detail. In 
Section 3.1, we discuss some important components of hypertext systems and how they fit into the 
r-model. In Section 3.2, we turn our attention to central issues in implementation of a hypertext 
system that are orthogonal to our model-centered r-model. Finally, in Section 3.3, we discuss the 
intersection of our r-model with already-existing defined and de facto standards.

3.1 Further discussion of elements of the r-model 

A number of structures and components have been identified for hypertexts.^5 Here, we present
some of these hypertext elements and describe their categorization within our reference model. 

3.1.1 Hypertext model structures 

We emphasize that the hypertext's abstract structure is arbitrary in form within the reference 
model. It may be graph-based, describing only object interrelationships, or it may also have 
automaton semantics. It need not be homogeneous in form; heterogeneous structures may be 
appropriate for some applications. It need not be static in form but may be dynamic. Indeed, it
need not be explicitly computed or represented. What is required, however, is that it be possible
to intuit where it is possible to include content in the hypertext and also the relationships between
elements of the content.

3.1.2 Anchors 

In some other models of hypertext, anchors have been identified as a separable component of
a hypertext.^6 The anchor represents the terminating point or points of a link. In one general
form, anchors may be associated with both the source and the target of a one-directional link in
a hypertext. They present the relationship between the identified portion of the source and the
identified portion of the target. In other implementations, anchors are only associated with the source,
with the target being the node as a whole. In our Trellis implementations, anchors may or may
not be associated with the source; when no anchor is associated with a source, the link is
represented by a (graphical) button in a separately displayed palette.

^5 See [LSK88], for example, for definitions of related terminology.
^6 See, for example, the Dexter reference model [HS90].

Within the r-model, the display of anchors in source and target is specified in the mapping that
defines the concrete content (concrete context level). Both the form of the display and also its
position are described here. Issues involving positioning of the target content's display when a link
is followed are addressed in the definition of the concrete windows (concrete hypertext level).

3.1.3 Different flavors of links

A hypertext implementation may contain several different kinds of links, each with a different 
implemented action on selection. The distinction between the different types of link is reflected in
the r-model by a difference between the types of their corresponding abstract buttons. 

The display of the source or target of a link may be static or may be computed. Such displays 
are described within the mapping that produces the concrete content representation. 

In some circumstances selection of a link may cause an apparent change to the displayed content, 
for example, insertion of the target's content into place in the source. When the content actually
changes in form, this is a matter of interest in the concrete content. However, when the content is 
actually unchanged in form, as is the case when the target material is inserted, this can be described 
through the display mapping that produces the concrete windows representation. 

3.1.4 Dynamic content 

Abstract content may be statically defined or it may be computed. It is useful to distinguish 
separate categories of computed content from one another. One such categorization distinguishes:

• Computed content: execution of an algorithm that produces a subsequently static display

• Dynamic content: Dynamic execution of an algorithm: start on node entry, terminate on 
node exit 

• Filtered computation: Continuously-executing filter 
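
One way to read the three categories is sketched below in Python. The interpretation is an assumption, particularly for the filtered computation, which is taken here to mean a transformation applied continuously to a stream of input; the names computed, Dynamic, and filtered are hypothetical.

```python
import itertools


def computed(algorithm):
    """Computed content: run the algorithm once; the result is then static."""
    return algorithm()


class Dynamic:
    """Dynamic content: the algorithm runs only between node entry and node exit."""

    def __init__(self, make_frames):
        self._make_frames = make_frames   # callable returning an iterator of displays
        self._frames = None

    def enter(self):
        self._frames = self._make_frames()   # start on node entry

    def next_display(self):
        if self._frames is None:
            raise RuntimeError("node is not active")
        return next(self._frames)

    def exit(self):
        self._frames = None                  # terminate on node exit


def filtered(source, transform):
    """Filtered computation: apply a filter to each item of a continuing stream."""
    for item in source:
        yield transform(item)


# Hypothetical usage: a counter that restarts each time the node is entered.
ticker = Dynamic(lambda: itertools.count(1))
```

The distinction that matters for the r-model is when the computation runs relative to the reader's visit, not what it computes.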

3.2 Orthogonal considerations 

The r-model is centered around organizing and categorizing the parts of a model of hypertext.
Consequently, there are elements of an implementation, as well as elements of some hypertext 
models, that are not included in the r-model. These will be presented in this section of the report. 

3.2.1 Hypertext browsing semantics 

We have previously defined a hypertext system's browsing semantics [SF89a] as the dynamic
properties of a reader's experience when browsing a document; in other words, as the manner in which
the information within the hypertext is to be visited and presented. In most cases, browsing
semantics are specified by the code that implements the hypertext system. However, it is also possible
to develop a hypertext model with variable browsing semantics; for example, our Trellis hypertext
model permits specification of the hypertext's browsing semantics [FS89b].^7 Although specifiable
browsing semantics are in some hypertext models, they are not in all, and so we have decided not
to include them directly in the r-model.

^7 The behaviors associated with different link types are reflected by their browsing semantics. Consequently,
variable browsing semantics are the implementation mechanism for user-defined link types, as well as other browsing
behaviors.

Similarly, we have not included the hypertext's dynamic behavior in the r-model. By dynamic 
behavior, we mean those cases in which a hypertext system traverses the structure without inter- 
vention from the reader [SF89b]. Dynamic behavior is distinct from dynamic content, however. As 
noted above, dynamic content is described within the model. 

3.2.2 Characteristics of the content 

Some hypertext systems may favor an organization in which each piece of content is treated as 
a small card-sized unit while others favor organizations in which the content is viewed as a long 
continuous scroll. Such considerations are outside of the scope of the r-model.

3.2.3 Physical-level descriptions and interchange descriptions 

If the structure of the implemented hypertext system closely parallels that of the r-model, it will 
certainly be necessary to define a storage format for those representations that are specified directly
as well as a description of the mappings that produce the others. However, the specific design of 
such storage formats is outside of the scope of the r-model, as is the equally-important design of 
formats designed to permit interchange between hypertext systems and installations. 

3.2.4 User interfaces 

Certainly to the reader of a hypertext, the most visible component of the system is its user interface.
However, the user interface is also an element of the system not discussed in the r-model. We note
that it is possible to associate many different styles of user interface with the same underlying
hypertext model. 

3.3 Intersection with existing standards 

There are two points of intersection between the r-model and existing standards. The first, in the 
abstract component level, are the abstractions used to define the abstract content. An appropriate
standard to consider for text, for example, would be SGML [ISO86]. Similar utility could be made
of standards to define graphical material as well as other content objects. It may be necessary,
however, to augment these standard representations with additional information describing the
potential interactions defined by the content-structure and button-structure associations, and as
reflected in the concrete content.

The other point of intersection with proposed standards is in the visible hypertext level. Each
visible HT segment and user display may be based around a protocol such as that of the X-windows
system [SG86]. Other de facto interface standards such as SunTools, OpenLook, Viewpoint, Motif,
and NextStep are also applicable at this point.

4 Discussion and conclusions 

We have described a meta-model of hypertext, which we call the r-model, that helps to organize the
portions of a hypertext model. It is possible that the hypertext model's design will also correspond
to the divisions established in the r-model, but it is equally permissible that the relationships
be less-clearly drawn in the hypertext model. Furthermore, the implementation of the hypertext
system may also correspond directly to the model or again distinct model concepts may be merged
in implementation.

In our own work in developing the Trellis hypertext model and prototype implementations, we
have tended to reflect the divisions of the r-model strongly in our hypertext model and also to
carry these divisions on into our implementation. In essence, our implementation is based on a
collection of abstract data types, where the data types correspond to the representations in the
r-model. A natural consequence of this retention of separation has been that it is easy to extend
the environment in which the implementation resides, for example to consider designs that permit
multiple readers to be active in the hypertext at the same time that a writer is modifying it.
Moreover, the retention of separation between structure, content, and context permits flexible reuse
of the hypertext's structure and of the content of the hypertext. 

While we believe that direct application of the r-model has benefits in guiding the implemen- 
tation of a hypertext system, we also believe that a greater understanding of a hypertext model 
can be gained by casting it into the form of the r-model. It is this increased understanding that we 
believe is of primary importance outside of the context of our own development.

Abstract

This paper presents the Dexter hypertext reference model. The 
Dexter model is an attempt to capture, both formally and informally, 
the important abstractions found in a wide range of existing and future 
hypertext systems. The goal of the model is to provide a principled 
basis for comparing systems as well as for developing interchange and 
interoperability standards. The model is divided into three layers.
The storage layer describes the network of nodes and links that is the
essence of hypertext. The runtime layer describes mechanisms support-
ing the user's interaction with the hypertext. The within-component
layer covers the content and structures within hypertext nodes. The
focus of the model is on the storage layer as well as on the mechanisms
of anchoring and presentation specification that form the interfaces
between the storage layer and the within-component and runtime lay-
ers, respectively. The model is formalized using Z [19], a specification
language based on set theory. The paper briefly discusses the issues
involved in comparing the characteristics of existing systems against
the model. 

Acknowledgement: The model described in this paper grew out of a series of workshops
on hypertext. The following people attended these workshops and were instrumental in
the development of the model: Rob Akscyn, Doug Engelbart, Steve Feiner, Frank Halasz,
John Leggett, Don McCracken, Norm Meyrowitz, Tim Oren, Amy Pearl, Catherine
Plaisant, Mayer Schwartz, Randy Trigg, Jan Walker, and Bill Weiland. The workshops
were organized by Jan Walker and John Leggett.

What do hypertext^1 systems such as NoteCards [10], Neptune [4], KMS
[1], Intermedia [23] and Augment [6] have in common? How do they differ?
In what way do these systems differ from related classes of systems such
as multimedia database systems? At a very abstract level, each of these
hypertext systems provides its users with the ability to create, manipulate,
and/or examine a network of information-containing nodes interconnected
by relational links. Yet these systems differ markedly in the specific data
models and sets of functionality that they provide to their users. Augment,
Intermedia, NoteCards, and Neptune, for example, all provide their users
with a universe of arbitrary-length documents. KMS and HyperCard, in
contrast, are built around a model of a fixed-size canvas onto which items
such as text and graphics can be placed. Given these two radically different
designs, is there anything common between these systems in their notions
of hypertext nodes?

In an attempt to provide a principled basis for answering these ques-
tions, this paper presents the Dexter hypertext reference model. The model
provides a standard hypertext terminology coupled with a formal model of
the important abstractions commonly found in a wide range of hypertext
systems. Thus, the Dexter model serves as a standard against which to com-
pare and contrast the characteristics and functionality of various hypertext
(and non-hypertext) systems. The Dexter model also serves as a principled
basis on which to develop standards for interoperability and interchange
among hypertext systems.

The Dexter reference model described in this paper was initiated as the
result of two small workshops on hypertext. The first workshop was held in
October 1988 at the Dexter Inn in New Hampshire. Hence the name of the
model. The workshops had representatives from many of the major existing
hypertext systems.^3 A large part of the discussion at these workshops was
the elicitation of the abstractions common to the major hypertext systems.
The Dexter model is an attempt to capture, fill out, and formalize the results
of these discussions.

^1 The terms hypertext and hypermedia are often differentiated, with hypertext referring
to text-only systems and hypermedia referring to systems that support multiple media.
This distinction is not made in the present paper; the term hypertext is used generically
to refer to both text-only and multimedia systems.

^2 Participants in the two workshops are listed in the acknowledgements on the first page
of this paper.

^3 Among the systems that were discussed at the workshops were Augment, Concor-
dia/Document Examiner, IGD, FRESS, Intermedia, HyperCard, Hyperties, KMS/ZOG,
Neptune/HAM, NoteCards, the Sun Link Service, and Textnet.

Another important focus of the workshops was an attempt to find a
common terminology for the hypertext field. This turned out to be an
extremely difficult task, especially so in the absence of an understanding of
the common (and differing) abstractions among the various systems. The
term "node" turned out to be especially difficult given the extreme variation
in the use of the term across the various systems. By providing a well-
defined set of named abstractions, the Dexter model provides a solution to
the hypertext terminology problem. It does so, however, at some cost. In
order to avoid confusion, the model does not use contentious terms such as
"node", preferring neutral terms such as "component" for the abstraction in
the model.

In the present paper, the Dexter model is formulated in Z [19], a formal
specification language based on typed set theory. The use of Z provides a
rigorous basis for defining the necessary abstractions and for discussing their
use and interrelationships. Although an understanding of the Z language
is a prerequisite for fully understanding the details of the Dexter model as
described in this paper, the paper attempts to provide a complete description
of the model in the prose accompanying the formal specification. Readers
unfamiliar with Z should be able to gain a full, if not precisely detailed,
understanding of the model.

This paper also refers in passing to architectural concepts found in
a number of existing hypertext systems including Augment [6], Concor-
dia/Document Examiner [22], HyperCard [8], Hyperties [18], IGD [7], In-
termedia [23], KMS [1], Neptune/HAM [4], NoteCards [10], the Sun Link
Service [17], and Textnet [20]. The reader is assumed to be familiar with
the general characteristics and functionality of these systems. Appropriate
background material on these systems can be found in Conklin [3] and in
the proceedings of the Hypertext 87 [11] and Hypertext 89 [12] conferences.

This paper is divided into four main sections. The first section provides a
brief discursive overview of the entire model. The second section describes
the storage layer of the model, both formally and informally. The third
section describes the runtime layer of the model in a similar manner. The
final section discusses issues involved in comparing existing systems against
the model.






Figure 1: Layers of the Dexter model. The figure shows three layers: the
Runtime Layer (presentation of the hypertext; user interaction; dynamics),
the Storage Layer (a 'database' containing a network of nodes and links), and
the Within-Component Layer (the content/structure inside the nodes). A
label, "Focus of the Dexter Model", marks the part on which the model
concentrates.



1 An Overview of the Model 

The Dexter model divides a hypertext system into three layers: the runtime
layer, the storage layer, and the within-component layer, as illustrated
in Figure 1. The main focus of the model is on the storage layer, which
models the basic node/link network structure that is the essence of
hypertext. The storage layer describes a 'database' that is composed of a
hierarchy of data-containing "components" which are interconnected by
relational "links". Components correspond to what is typically thought of as
nodes in a hypertext network: cards in NoteCards and HyperCard, frames in KMS,
documents in Augment and Intermedia, or articles in Hyperties. Components
contain the chunks of text, graphics, images, animations, etc. that
form the basic content in the hypertext network.

The storage layer focuses on the mechanisms by which the components 
and links are "glued together" to form hypertext networks. The components 
are treated in this layer as generic containers of data. No attempt is made 
to model any structure within the container. Thus, the storage layer makes 
no differentiation between text components and graphics components. Nor 
does it provide any mechanisms for dealing with the well-defined structure 
inherent within a structured document (e.g., an ODA document) component.

In contrast, the within-component layer of the model is specifically
concerned with the contents and structure within the components of the
hypertext network. This layer is purposefully not elaborated within the Dexter
model. The range of possible content/structure that can be included in a
component is open-ended. Text, graphics, animations, simulations, images,
and many more types of data have been used as components in existing
hypertext systems. It would be folly to attempt a generic model covering
all of these data types. Instead, the Dexter model treats within-component
structure as being outside of the hypertext model per se. It is assumed
that other reference models designed specifically to model the structure of
particular applications, documents, or data types (ODA, IGES, etc.) will be
used in conjunction with the Dexter model to capture the entirety of the
hypertext, including the within-component content and structure.

An extremely critical piece of the Dexter model, however, is the
interface between the hypertext network and the within-component content and
structure. The hypertext system requires a mechanism for addressing
(referring to) locations or items within the content of an individual
component. In the Dexter model, this mechanism is known as anchoring. The
anchoring mechanism is necessary, for example, to support span-to-span links
such as are found in Intermedia. In Intermedia, the components are complete
structured documents. Links are possible not only between documents, but 
between spans of characters within one document and spans of characters 
within another document. Anchors are a mechanism that provides this 
functionality while maintaining a clean separation between the storage and 
within-component layers. 

The storage and within-component layers treat hypertext as an essen- 
tially passive data structure. Hypertext systems, however, go far beyond 
this in the sense that they provide tools for the user to access, view, and 
manipulate the network structure. This functionality is captured by the 
runtime layer of the model. As in the case of within-component structure,
the range of possible tools for accessing, viewing, and manipulating
hypertext networks is far too broad and too diverse to allow a simple, generic
model. Hence the Dexter model provides only a bare-bones model of the
mechanism for presenting a hypertext to the user for viewing and editing.
This presentation mechanism captures the essentials of the dynamic,
interactional aspects of hypertext systems, but it does not attempt to cover
the details of user interaction with the hypertext.

As in the case of anchoring, a critical aspect of the Dexter model is the
interface between the storage layer and the runtime layer. In the Dexter
model this is accomplished using the notion of presentation specifications.
Presentation specifications are a mechanism by which information about
how a component/network is to be presented to the user can be encoded
into the hypertext network at the storage layer. Thus, the way in which a
component is presented to the user can be a function not only of the specific
hypertext tool that is doing the presentation (i.e., the specific runtime
layer), but can also be a property of the component itself and/or of the
access path (link) taken to that component.

Figure 2: Illustration of the need for presentation specifications on the
access path (i.e., links) as well as on the components themselves.

Figure 2 illustrates the importance of the presentation specifications
mechanism. In this figure, there is an animation component taken from
a computer-based training hypertext. This animation component can be
accessed from two other components, a "teacher" component and a "student"
component. When following the link from the student component,
the animation should be brought up as a running animation. In contrast,
when coming from the teacher component, the animation should be brought
up in editing mode, ready to be altered. In order to separate these two cases,
the runtime layer needs to access presentation information encoded into the
links in the network. Presentation specifications are a generic way of doing
just this. Like anchoring, they provide an interface that allows the storage
layer to communicate in a generic way with the runtime layer without
violating the separation between the two layers.
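
The teacher/student example can be sketched in a few lines of Python. This
is an illustration only, not part of the Dexter model: the names Component,
Specifier, and choose_presentation, the string-valued presentation
specifications "run" and "edit", and the rule that the link's specification
overrides the component's default are all assumptions made for the sketch.
The model itself treats presentation specifications as uninterpreted values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Component:
    uid: str
    present_spec: str  # default presentation stored with the component


@dataclass(frozen=True)
class Specifier:
    component_uid: str
    present_spec: Optional[str] = None  # presentation stored on the link end


def choose_presentation(target: Component, via: Optional[Specifier]) -> str:
    """Hypothetical runtime-layer policy: a presentation specification found
    on the access path (the link) overrides the component's own default."""
    if via is not None and via.present_spec is not None:
        return via.present_spec
    return target.present_spec


animation = Component(uid="anim-1", present_spec="run")
from_student = Specifier(component_uid="anim-1", present_spec="run")
from_teacher = Specifier(component_uid="anim-1", present_spec="edit")
```

Under these assumptions, arriving through from_teacher selects editing mode,
while arriving through from_student, or with no link at all, selects the
running animation.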

Figure 3 attempts to give a flavor of the various layers of the Dexter
model as they are embedded within a typical hypertext system. The figure
depicts a 3 node/1 link hypertext network. The storage layer contains
four entities: the three components (i.e., nodes) and the link. The actual
contents (text and graphics) for the components are located to the right of
the storage layer in the within-component layer. In the runtime layer, the
single graphics component is being presented to the user. The link emanating
from this node is marked by an arrowhead located near the bottom of
the node's window on the computer screen.

Figure 3: A depiction of the three layers of the Dexter model as embedded
in an actual hypertext system. Panels, left to right: Runtime Layer,
Storage Layer, Within-Component Layer.

2 Simple Storage Layer Model 
2.1 An Overview of the Storage Layer 

The storage layer describes the structure of a hypertext as a finite set of 
components together with two functions, a resolver function and an accessor 
function. The accessor and resolver functions are jointly responsible for 
"retrieving" components, i.e., mapping specifications of components into 
the components themselves. 

The fundamental entity and basic unit of addressability in the storage layer
is the component. A component is either an atom, a link, or a composite
entity made up from other components. Atomic components are primitive
in the (storage layer of the) model. Their substructure is the concern of the
within-component layer. Atomic components are what is typically thought
of as a "node" in a hypertext system, e.g., a card in NoteCards, a frame in
KMS, a document in Intermedia, a statement in Augment. Links are entities
that represent relations between other components. They are basically a
sequence of 2 or more "endpoint specifications", each of which refers to (a
part of) a component in the hypertext. The structure of links will be detailed
below. Composite components are constructed out of other components.
The composite component hierarchy created when one composite component
contains another composite is restricted to be a directed acyclic graph (DAG),
i.e., no composite may contain itself either directly or indirectly. Composite
components are relatively rare in the current generation of hypertext systems.
One exception is the Augment system, where a document is a tree-structured
composition of atomic components called statements.

Every component has a globally unique identity which is captured by
its unique identifier (UID). UIDs are primitive in the model, but they are
assumed to be uniquely assigned to components across the entire universe of
discourse (not just within the context of a single hypertext). The accessor
function of the hypertext is responsible for "accessing" a component given
its UID, i.e., for mapping a UID into the component "assigned" that UID.

UIDs provide a guaranteed mechanism for addressing any component 
in a hypertext. But the use of UIDs as a basic addressing mechanism in 
hypertext may be too restrictive. For example, it is possible in the Augment 
system to create a link to "the statement containing the word 'pollywog'". 
The statement specified by this link may not exist or it may change over 
time as documents are edited. Therefore, the link cannot rely on a specific 
statement UID to address the target statement. Rather, when the link is 
followed, the specification must be "resolved" to a UID (if possible), which 
then can be used to access the correct component.
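
The "pollywog" case can be sketched as a two-step lookup: resolve a
specification to a UID, then access the component by UID. The following
Python is only a sketch under stated assumptions. Representing a
content-based specification as a predicate over statement text, the names
resolve and access, and the sorted tie-break among several matches are all
hypothetical and are not taken from the model or from Augment.

```python
from typing import Callable, Dict, Optional, Union

UID = str
# A component specification is either a UID or a predicate over contents
# (the predicate form is an assumption made for this sketch).
ComponentSpec = Union[UID, Callable[[str], bool]]


def resolve(spec: ComponentSpec, store: Dict[UID, str]) -> Optional[UID]:
    """Partial resolver: returns a UID, or None when nothing matches."""
    if isinstance(spec, str):
        # A UID used as its own specification resolves to itself.
        return spec if spec in store else None
    for uid in sorted(store):  # deterministic choice among matches
        if spec(store[uid]):
            return uid
    return None


def access(uid: UID, store: Dict[UID, str]) -> str:
    """Accessor: maps a UID to the component assigned that UID."""
    return store[uid]


statements = {"s1": "a frog sat on a log", "s2": "the pollywog swam away"}
has_pollywog = lambda text: "pollywog" in text
```

Because resolution happens each time the link is followed, editing the
statement so that it no longer mentions "pollywog" makes resolve return
None rather than a stale UID.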

This kind of indirect addressing is supported in the storage layer using 
component specifications together with the resolver function. The resolver 
function is responsible for "resolving" a component specification into a UID, 
which can then be fed to the accessor function to retrieve the specified com- 
ponent. Note, however, that the resolver function is only a partial function. 
A given specification may not be resolvable into a UID, i.e., the component 
being specified may not exist. However, it is the case that for every com- 
ponent there is at least one specification that will resolve to the UID for 
that component. In particular, the UID itself may be used as a specifier, in
which case the resolver function is the identity function.

Implementing span-to-span links (e.g., in Intermedia) requires more than 
simply specifying entire components. Span-to-span linking depends on a 
mechanism for specifying substructure within components. But in order 
to preserve the boundary between the hypertext network per se and the 
content/structure within the components, this mechanism cannot depend 
in any way on knowledge about the internal structure of (atomic) compo- 
nents. In the Dexter model, this is accomplished by an indirect addressing 
entity called an anchor. An anchor has two parts: an anchor id and an 
anchor value. The anchor value is an arbitrary value that specifies some
location, region, item, or substructure within a component. This anchor value
is interpretable only by the applications responsible for handling the con- 
tent/structure of the component. It is primitive and unrestricted from the 
viewpoint of the storage layer. The anchor id is an identifier which uniquely 
identifies its anchor within the scope of its component. Anchors can there- 
fore be uniquely identified across the whole universe by a component UID, 
anchor id pair. 

The two-part composition of an anchor is designed to provide a fixed point
of reference for use by the storage layer, the anchor id, combined with a
variable field for use by the within-component layer, the anchor value. As
a component changes over time (e.g., when it is edited within the runtime
layer), the within-component application will change the anchor value to
reflect changes to the internal structure of the component or to reflect
within-component movement of the point, region, or items to which the anchor
is conceptually attached. The anchor id, however, will remain constant,
providing a fixed referent that can be used to specify a given structure
within a component.
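
The division of labor between anchor id and anchor value can be illustrated
with a small Python sketch. It assumes, purely for illustration, that the
component is a text string and that an anchor value is a (start, end)
character span. The class TextComponent and its insert method are
hypothetical names, and the model itself leaves anchor values uninterpreted.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Span = Tuple[int, int]  # (start, end) character offsets: an assumed anchor value


@dataclass
class TextComponent:
    text: str
    anchors: Dict[int, Span] = field(default_factory=dict)  # anchor id -> value

    def insert(self, pos: int, s: str) -> None:
        """Within-component edit: insert s at pos and update anchor values.
        Anchor ids never change; only the values (spans) move."""
        self.text = self.text[:pos] + s + self.text[pos:]
        n = len(s)
        for aid, (start, end) in list(self.anchors.items()):
            if pos <= start:
                self.anchors[aid] = (start + n, end + n)  # span shifts right
            elif pos < end:
                self.anchors[aid] = (start, end + n)      # span grows


doc = TextComponent("see the pollywog", anchors={1: (8, 16)})
doc.insert(0, "Please ")
```

After the edit, anchor id 1 still identifies the same word, although its
value has moved. A link that stores only the pair (component UID, anchor id)
therefore needs no update.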

The mechanism of the anchor id can be combined with the component
specification mechanism to provide a way of specifying the endpoints of
a link. In the model, this is captured by an entity called a specifier which
consists of a component specification, an anchor id, and two additional fields:
a direction and a presentation specification. A specifier specifies a component
and an anchor 'point' within a component that can serve as the endpoint
of a link. The direction encodes whether the specified endpoint is to be
considered a source of a link, a destination of a link, both a source and a
destination, or neither a source nor a destination. (These are encoded by
direction values of FROM, TO, BIDIRECT, and NONE, respectively.) The
present specification is a primitive value that forms part of the interface
between the storage layer and the runtime layer. The nature and use of
present specifications will be discussed in conjunction with the runtime layer
below.

Figure 4: A depiction of the overall organization of the storage layer,
including specifiers, links, and anchors.

Returning to the issue of link components, it is now possible to describe 
their structure a bit more precisely. In particular, a link is simply a sequence 
of 2 or more specifiers. Note that this provides for links of arbitrary arity, 
despite the fact that binary links are standard in existing hypertext systems. 
Directional links, also standard in existing systems, are handled using the 
direction field in the specifier. 

Figure 4 depicts the overall organization of the storage layer including 
specifiers, links, and anchors. The figure depicts 5 components including 3 
atomic components, 1 composite component (constructed from two of
the atomic components plus some text), and 1 link component that repre- 
sents a connection from the anchor (i.e., span) within an atomic component 
(#3346) to the anchor (span) in the composite component (#4112). 

In the foregoing discussion, components were described as being either
an atom, a link, or a composition of other components. In actuality, this
describes what the model calls a base component. In contrast, components 
in the model are complex entities that contain a base component together 
with some associated component information. The component information
describes the properties of the component other than its 'content'.
Specifically, the component information contains a sequence of anchors that index
into the component, a present specification that contains information for the 
runtime layer about how the component should be presented to the user, 
and a set of arbitrary attribute/value pairs. The attribute/value pairs can 
be used to attach any arbitrary property (and its value) to a component. For 
example, keywords can be attached to a component using multiple 'keyword'
attributes. Similarly, a component type system can be implemented in the 
model by adding to each component a 'type' attribute with an appropriate 
type specification as its value. 

In addition to a data model, the storage layer defines a small set of
operations that can be used to access and/or modify a hypertext. All of these
operations are defined in such a way as to maintain the invariants of the
hypertext, e.g., the fact that the composition hierarchy of
components/subcomponents is acyclic. The operations defined in the model
include adding a component (atomic, link or composite) to a hypertext,
deleting a component from the hypertext, and modifying the contents or
ancillary information (e.g., anchors or attributes) of a component. There are
also operations for retrieving a component given its UID or any specifier
that can be resolved to its UID. Finally, there is one operation needed for
determining the interconnectivity of the network structure. This operation,
linksToAnchor, returns the set of links that refer to an anchor when given the
anchor and its containing component.
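
As a rough illustration of the last operation, the Python sketch below finds
the links that refer to a given anchor. It makes several simplifying
assumptions that are not part of the model: component specifications are
plain UIDs (so the resolver is the identity), a specifier is reduced to a
(component UID, anchor id) pair, and a link counts as referring to an anchor
when one of its specifiers names both the component and the anchor id. The
names Atom, Link, and links_to_anchor are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple, Union


@dataclass(frozen=True)
class Atom:
    content: str


@dataclass(frozen=True)
class Link:
    # Each specifier is simplified to (component UID, anchor id).
    specifiers: Tuple[Tuple[str, int], ...]


Hypertext = Dict[str, Union[Atom, Link]]  # UID -> component


def links_to_anchor(h: Hypertext, uid: str, anchor_id: int) -> Set[str]:
    """UIDs of link components with a specifier naming (uid, anchor_id)."""
    return {
        link_uid
        for link_uid, comp in h.items()
        if isinstance(comp, Link) and (uid, anchor_id) in comp.specifiers
    }


h: Hypertext = {
    "a": Atom("source text"),
    "b": Atom("target text"),
    "l1": Link((("a", 1), ("b", 1))),
    "l2": Link((("a", 2), ("b", 1))),
}
```

In this example, both links refer to anchor 1 of component "b", while only
"l2" refers to anchor 2 of component "a".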

2.2 Formalization of the Storage Layer 

As described above, we envision a hypertext system consisting of a set of 
components, each of which has a UID from the given set UID. 

[UID] 

Retrieving a component involves finding its UID and then using that
UID to get hold of the actual component; this is accomplished by means
of an accessor function which returns a component given its UID. UIDs are
normally not meant to be visible to clients of a hypertext system. Given
a component specification, it may be possible to find the UID to which
the component specification refers, by means of a resolver function.
Component specifications arise from the given set COMPONENT_SPEC. We
also have a description for the visual presentation (present spec) of a
component, which as part of a component is used in the run-time layer but
not in the storage layer; these visual descriptions come from the given set
PRESENT_SPEC.

    [COMPONENT_SPEC, PRESENT_SPEC]

Links are an important kind of component and are supported in every 
hypertext system. Directionality is sometimes important for links, while at
other times it is immaterial. We introduce DIRECTION as a free type to
model respectively the end of a link as a source, as a destination, as both a
source and destination, or as neither.

    DIRECTION ::= FROM | TO | BIDIRECT | NONE

The schema type SPECIFIER essentially takes the form of the descrip- 
tion of one end of a "link." This description is sometimes sufficient to 
determine the UID of the component at one end of a link. As described in 
the overview, anchoring plays an important part in the model. Anchors are 
identified by means of a unique (to a component) anchor id from the given set
ANCHOR_ID. Anchor values come from the given set ANCHOR_VALUE.
Anchors are then just pairs of anchor id and associated anchor value.

    [ANCHOR_ID, ANCHOR_VALUE]

    ANCHOR == ANCHOR_ID × ANCHOR_VALUE

A value of type SPECIFIER describes a single end of a link. We include 
the variable presentSpec in the SPECIFIER schema so we can model differ- 
ent ways of visually showing links as we follow them (based on the specifier 
used), as illustrated in the example shown in Figure 2. 

    ┌─ SPECIFIER ─────────────────────────────
    │ componentSpec : COMPONENT_SPEC
    │ anchorSpec : ANCHOR_ID
    │ presentSpec : PRESENT_SPEC
    │ direction : DIRECTION
    └─────────────────────────────────────────



Links must include at least two specifiers. What appear to be one-way
links, such as HyperCard buttons, can be modeled as two-way links with the
button end having a DIRECTION with value NONE and the other end
having a DIRECTION with value TO. The two-specifier link constraint
simplifies the hypertext model. On the other hand there is no reason not
to have multi-way links, and so the model accommodates them. In the most
general model, duplicate specifiers are allowed. The only further constraint
is that at least one specifier have a direction of TO.

    ┌─ LINK ──────────────────────────────────
    │ specifiers : seq SPECIFIER
    ├─────────────────────────────────────────
    │ # specifiers ≥ 2
    │ ∃ s : ran specifiers • s.direction = TO
    └─────────────────────────────────────────
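
The two predicates of the LINK schema can be mirrored as constructor checks.
The Python below is a sketch under assumptions, not a rendering of the
model: component specifications and presentation specifications are stood in
for by strings, anchor ids by integers, and the helper button_link is a
hypothetical name for the HyperCard-button encoding described in the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Direction(Enum):
    FROM = "FROM"
    TO = "TO"
    BIDIRECT = "BIDIRECT"
    NONE = "NONE"


@dataclass(frozen=True)
class Specifier:
    component_spec: str
    anchor_spec: int
    present_spec: str
    direction: Direction


@dataclass(frozen=True)
class Link:
    specifiers: Tuple[Specifier, ...]

    def __post_init__(self) -> None:
        # Invariant 1: a link has at least two specifiers.
        if len(self.specifiers) < 2:
            raise ValueError("a link needs at least two specifiers")
        # Invariant 2: at least one specifier has direction TO.
        if not any(s.direction is Direction.TO for s in self.specifiers):
            raise ValueError("at least one specifier must have direction TO")


def button_link(card: str, target: str) -> Link:
    """A button modeled as in the text: the button end has direction NONE,
    the other end has direction TO."""
    return Link((
        Specifier(card, 1, "", Direction.NONE),
        Specifier(target, 1, "", Direction.TO),
    ))
```

Checking the invariants at construction time means that every Link value
that exists already satisfies the schema. This is one possible design, and
the model itself does not prescribe when the predicates are checked.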



A base component (a generalization of the traditional "node" or "link")
of a hypertext can be one of the following:

• an atomic element, which is modeled by the given type ATOM,

      [ATOM]

  and which models a "node" of a typical hypertext system but with the
  internal detail omitted,

• a link, which is modeled by the LINK schema given above, or

• a composite, which can be described recursively as a sequence of base
  components.

Components can have ancillary information associated with them, such 
as attribute/value pairs, anchors, or presentation information. Most hyper- 
text systems allow for attributes of components. These attributes can be 
thought of as attribute/value pairs which can be modeled as a partial func- 
tion mapping attributes to values. We thus introduce two additional given 
sets, one for the set of attribute names and the other for the set of possible 
values: 

[ATTRIBUTE, VALUE] 

The additional information associated with a base component, which was 
mentioned above, can be captured in the following schema. We include the 
invariant that anchor ids are unique within a given component, i.e., the 
number of anchors within a component is equal to the size of the set of 
(different) anchor ids within the component.

    ┌─ COMP_INFO ─────────────────────────────
    │ attributes : ATTRIBUTE ⇸ VALUE
    │ anchors : seq ANCHOR
    │ presentSpec : PRESENT_SPEC
    ├─────────────────────────────────────────
    │ # anchors = # (first (| ran anchors |))
    └─────────────────────────────────────────



Note that a presentSpec always has some value. We introduce the function
minInfo which returns an instance of this schema with "minimal information,"
that is, no attributes, no anchors and a presentSpec which is given as
an argument.

    │ minInfo : PRESENT_SPEC → COMP_INFO
    ├─────────────────────────────────────────
    │ ∀ ps : PRESENT_SPEC •
    │     minInfo(ps) = (μ info : COMP_INFO |
    │         info.attributes = ∅ ∧
    │         info.anchors = ⟨⟩ ∧
    │         info.presentSpec = ps)

We use the recursive type, BASE_COMPONENT, to describe the base
components of a hypertext system.

    BASE_COMPONENT ::= atom⟨⟨ATOM⟩⟩
                     | link⟨⟨LINK⟩⟩
                     | composite⟨⟨seq BASE_COMPONENT⟩⟩

Finally, the schema COMPONENT represents a base component along with 
its associated information. 

    ┌─ COMPONENT ─────────────────────────────
    │ compBase : BASE_COMPONENT
    │ compInfo : COMP_INFO
    └─────────────────────────────────────────



The functions defined in the remainder of this section are there just 
to make the specification of the model easier to read and understand — 
they are not meant to have any particular significance in their own right. 
The following function builds a component given its base component and
associated information.

    │ component : BASE_COMPONENT × COMP_INFO → COMPONENT
    ├─────────────────────────────────────────
    │ component = (λ b : BASE_COMPONENT; i : COMP_INFO •
    │     (μ c : COMPONENT |
    │         c.compBase = b ∧
    │         c.compInfo = i))

The following two functions extract respectively the base component and
associated information of a component.

    │ base : COMPONENT → BASE_COMPONENT
    │ info : COMPONENT → COMP_INFO
    ├─────────────────────────────────────────
    │ ∀ c : COMPONENT •
    │     base(c) = c.compBase ∧
    │     info(c) = c.compInfo

We introduce three predicates (prefix relations) which are respectively 
true iff a component is an atom, a link, or a composite. 

    │ isAtom _ : ℙ COMPONENT
    │ isLink _ : ℙ COMPONENT
    │ isComposite _ : ℙ COMPONENT
    ├─────────────────────────────────────────
    │ ∀ c : COMPONENT •
    │     (isAtom c ⇔ base(c) ∈ ran atom) ∧
    │     (isLink c ⇔ base(c) ∈ ran link) ∧
    │     (isComposite c ⇔ base(c) ∈ ran composite)

We also define a "type" consistency relationship between components:
two components are "type consistent" if they are both atoms, both
links, or both composites.

    │ _ typeConsistent _ : COMPONENT ↔ COMPONENT
    ├─────────────────────────────────────────
    │ ∀ c₁, c₂ : COMPONENT •
    │     c₁ typeConsistent c₂ ⇔
    │         (isAtom c₁ ∧ isAtom c₂) ∨
    │         (isLink c₁ ∧ isLink c₂) ∨
    │         (isComposite c₁ ∧ isComposite c₂)

Because link components are referred to quite frequently in what follows,
we introduce the schema LinkComp so we can define variables of that type.

    ┌─ LinkComp ──────────────────────────────
    │ COMPONENT
    ├─────────────────────────────────────────
    │ compBase ∈ ran link
    └─────────────────────────────────────────



We also introduce some helpful functions to extract the various parts 
that make up a base component type. The first two functions are only 
defined for link components and return respectively the set of component 
specs for the link and the set of anchor ids for the link. 

    │ componentSpecs : LinkComp ⇸ 𝔽 COMPONENT_SPEC
    │ anchorSpecs : LinkComp ⇸ 𝔽 ANCHOR_ID
    ├─────────────────────────────────────────
    │ ∀ c : LinkComp •
    │     componentSpecs(c) = {cs : COMPONENT_SPEC |
    │         ∃ s : ran (link⁻¹(base(c))).specifiers •
    │             cs = s.componentSpec} ∧
    │     anchorSpecs(c) = {as : ANCHOR_ID |
    │         ∃ s : ran (link⁻¹(base(c))).specifiers •
    │             as = s.anchorSpec}

The next two functions are defined for any component and return respec- 
tively its attributes and its anchors. 



    │ attributes : COMPONENT → (ATTRIBUTE ⇸ VALUE)
    │ anchors : COMPONENT → 𝔽 ANCHOR
    ├─────────────────────────────────────────
    │ ∀ c : COMPONENT •
    │     attributes(c) = (info(c)).attributes ∧
    │     anchors(c) = ran (info(c)).anchors

Finally, we introduce a function which given a component returns a
component just like the given one except that the attributes function is
(possibly) overwritten with a new value for a given attribute.

    │ modifyAttribute : COMPONENT × ATTRIBUTE × VALUE → COMPONENT
    ├─────────────────────────────────────────
    │ modifyAttribute = (λ c : COMPONENT; a : ATTRIBUTE; v : VALUE •
    │     (μ c' : COMPONENT | ∃ i, i' : COMP_INFO |
    │         i = info(c) •
    │             i'.attributes = i.attributes ⊕ {a ↦ v} ∧
    │             i'.anchors = i.anchors ∧
    │             i'.presentSpec = i.presentSpec ∧
    │             c' = component(base(c), i')))
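
The effect of the functional override in modifyAttribute can be sketched in
Python. The sketch is illustrative only: attributes and values are stood in
for by strings, the base component by a placeholder string, and the class
and function names (CompInfo, Component, modify_attribute) are assumptions
rather than part of the model.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class CompInfo:
    attributes: Dict[str, str] = field(default_factory=dict)
    anchors: Tuple[Tuple[int, str], ...] = ()
    present_spec: str = ""


@dataclass(frozen=True)
class Component:
    comp_base: str  # stands in for a BASE_COMPONENT value
    comp_info: CompInfo


def modify_attribute(c: Component, a: str, v: str) -> Component:
    """Return a copy of c whose attribute map is overridden at a.
    Anchors, presentation specification, and base are carried over."""
    new_attrs = {**c.comp_info.attributes, a: v}  # plays the role of the override
    return replace(c, comp_info=replace(c.comp_info, attributes=new_attrs))


c0 = Component("atom", CompInfo({"type": "note"}, ((1, "span"),), "default"))
c1 = modify_attribute(c0, "keyword", "dexter")  # adds a new attribute
c2 = modify_attribute(c1, "type", "memo")       # overwrites an existing one
```

As in the specification, the original component is left untouched and a new
component value is returned.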

Components can have sub-components and the same component may be
a sub-component to more than one component. This relationship will be
denoted by _ subcomp _ and is defined below.

    │ _ subcomp _ : COMPONENT ↔ COMPONENT
    ├─────────────────────────────────────────
    │ ∀ c₁, c₂ : COMPONENT •
    │     c₁ subcomp c₂ ⇔
    │         base(c₁) ∈ ran (composite⁻¹(base(c₂)))

A hypertext system, modeled by the schema PROTO_HYPERTEXT,
has three parts. (1) The set of components represents the traditional "nodes"
and "links" of a hypertext system. (2) A partial function termed the resolver
returns the UID for a given component specifier. Note that more than one
specifier may return the same UID. (3) To actually get hold of a component,
we introduce an accessor function which given a UID returns a component.
Note that this function, while partial, is invertible.

    ┌─ PROTO_HYPERTEXT ───────────────────────
    │ components : 𝔽 COMPONENT
    │ resolver : COMPONENT_SPEC ⇸ UID
    │ accessor : UID ⤔ COMPONENT
    └─────────────────────────────────────────



To identify those links resolving to a given component, we introduce the
function linksTo which, given a hypertext system and the UID of a component
in the system, returns the UIDs of links resolving to that component.

    │ linksTo : PROTO_HYPERTEXT × UID → 𝔽 UID
    ├─────────────────────────────────────────
    │ linksTo = (λ H : PROTO_HYPERTEXT; u : UID • {uid : UID |
    │     (∃ comp : LinkComp | comp ∈ H.components •
    │         uid = H.accessor⁻¹(comp) ∧
    │         (∃ s : COMPONENT_SPEC |
    │             s ∈ componentSpecs(comp) •
    │                 u = H.resolver(s)))})

There are four constraints which must be satisfied by an instance of the
schema PROTO_HYPERTEXT before we can call it a HYPERTEXT.

• The accessor function must yield a value for every component. Because
  this function is invertible, every component must then have a UID.

• The resolver function must be able to produce all possible valid UIDs.

• There are no cycles in the component-subcomponent relationship, that
  is, no component may be a subcomponent (directly or transitively) of
  itself.

• The anchor ids of a component must be the same as the anchor ids of
  the component specifiers of the links resolving to the component.

    ┌─ HYPERTEXT ─────────────────────────────
    │ PROTO_HYPERTEXT
    ├─────────────────────────────────────────
    │ ∀ c : components • c ∈ ran accessor
    │ ran resolver = dom accessor
    │ ∀ c : components • (c, c) ∉ (_ subcomp _)⁺
    │ ∀ c : components • ∃ lids : 𝔽 UID |
    │     lids = linksTo(θPROTO_HYPERTEXT, accessor⁻¹(c)) •
    │         first (| anchors(c) |) =
    │             ⋃ ((anchorSpecs ∘ accessor)(| lids |))
    └─────────────────────────────────────────
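
The first three constraints lend themselves to a direct check. The Python
below is a sketch under assumptions: components, UIDs, and component
specifications are all represented by strings, the subcomponent relation is
given explicitly as a set of (child, parent) pairs, and the fourth (anchor)
constraint is deliberately left out. The names has_cycle and is_hypertext
are hypothetical.

```python
from typing import Dict, Set, Tuple


def has_cycle(subcomp: Set[Tuple[str, str]]) -> bool:
    """True if some component is (transitively) a subcomponent of itself."""
    closure = set(subcomp)
    changed = True
    while changed:  # naive transitive closure; adequate for small examples
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return any(a == b for (a, b) in closure)


def is_hypertext(components: Set[str],
                 resolver: Dict[str, str],   # component spec -> UID
                 accessor: Dict[str, str],   # UID -> component
                 subcomp: Set[Tuple[str, str]]) -> bool:
    """Check the first three constraints; the anchor constraint is omitted."""
    injective = len(set(accessor.values())) == len(accessor)
    return (injective
            and components <= set(accessor.values())      # every component has a UID
            and set(resolver.values()) == set(accessor)   # ran resolver = dom accessor
            and not has_cycle(subcomp))                    # composition is acyclic


components = {"A", "B", "C"}
accessor = {"u1": "A", "u2": "B", "u3": "C"}
resolver = {"sA": "u1", "sB": "u2", "sC": "u3"}
subcomp = {("A", "C"), ("B", "C")}
```

The injectivity test corresponds to the requirement that the accessor be
invertible, which in the specification is part of the function's type rather
than a separate predicate.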



2.3 Adding New Components 

In this section we model adding a new component to a hypertext. The
last function defined in this section, CreateNewComponent, is the function
actually called from the run-time layer and is also part of the external view
of the model. (See the section on conformance with the reference model for
more about this external view.)

Adding a new component to the hypertext is given by the following
function. It ensures that the range of the accessor function is extended to
include the new component. The resolver function is also extended so that
there is at least one specifier for the new component's corresponding UID.

    │ createComponent : HYPERTEXT × COMPONENT → HYPERTEXT
    ├─────────────────────────────────────────
    │ ∀ H : HYPERTEXT; c : COMPONENT •
    │     ∃ H' : HYPERTEXT |
    │         H'.components = H.components ∪ {c} ∧
    │         (∃₁ uid : UID •
    │             (∃ componentSpec : COMPONENT_SPEC •
    │                 H'.accessor = H.accessor ∪ {uid ↦ c} ∧
    │                 H'.resolver = H.resolver ∪
    │                     {componentSpec ↦ uid})) •
    │             createComponent(H, c) = H'
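
A functional-style sketch of createComponent in Python follows. It is not
the model's definition: components are stood in for by strings, the fresh
UID comes from a hypothetical counter-based scheme, and the one component
specification added to the resolver is taken to be the UID itself. The
sketch also assumes that the component is not already present in the
hypertext.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, FrozenSet

_counter = itertools.count(1)  # assumed source of fresh UIDs


@dataclass(frozen=True)
class Hypertext:
    components: FrozenSet[str] = frozenset()
    resolver: Dict[str, str] = field(default_factory=dict)  # spec -> UID
    accessor: Dict[str, str] = field(default_factory=dict)  # UID -> component


def create_component(h: Hypertext, c: str) -> Hypertext:
    """Return a new hypertext extended with c, a fresh UID for it, and one
    component specification (here the UID itself) that resolves to it.
    Assumes c is not already in h."""
    uid = f"uid-{next(_counter)}"  # the model leaves UID assignment open
    return Hypertext(
        components=h.components | {c},
        resolver={**h.resolver, uid: uid},
        accessor={**h.accessor, uid: c},
    )


h0 = Hypertext()
h1 = create_component(h0, "card: Dexter overview")
```

The original hypertext h0 is unchanged. Each call returns a new value, in
keeping with the specification's style of relating a hypertext H to a
successor H'.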

The functions for creating a new node, link, and composite respectively 
are given below. They use the function createComponent described above. 

    │ createAtomicComponent : HYPERTEXT × ATOM × PRESENT_SPEC
    │                             → HYPERTEXT × COMPONENT
    ├─────────────────────────────────────────
    │ ∀ H : HYPERTEXT; a : ATOM; ps : PRESENT_SPEC •
    │     ∃ c : COMPONENT | c = component(atom(a), minInfo(ps)) •
    │         createAtomicComponent(H, a, ps) =
    │             (createComponent(H, c), c)

In creating a link, we must ensure that all of its component specifiers
resolve to existing components. To test for such consistency among links we
introduce the following link consistency predicate as a prefix relation.

    │ linkConsistent _ : ℙ HYPERTEXT
    ├─────────────────────────────────────────
    │ ∀ H : HYPERTEXT •
    │     linkConsistent H ⇔
    │         (∀ l : LINK; s : SPECIFIER |
    │             (∃ cl : LinkComp | cl ∈ H.components •
    │                 l = link⁻¹(base(cl))) ∧
    │             s ∈ ran l.specifiers •
    │                 (∃ c : COMPONENT | c ∈ H.components •
    │                     (H.accessor ∘ H.resolver)(s.componentSpec) = c))
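
The link consistency predicate amounts to a nested check over links and
their specifiers. The Python sketch below assumes a simplified
representation that is not part of the model: components, UIDs, and
component specifications are strings, and each link component is mapped
directly to the list of component specifications of its ends. The function
name link_consistent is hypothetical.

```python
from typing import Dict, List, Set


def link_consistent(components: Set[str],
                    links: Dict[str, List[str]],  # link component -> its component specs
                    resolver: Dict[str, str],     # component spec -> UID
                    accessor: Dict[str, str]      # UID -> component
                    ) -> bool:
    """True iff every component spec of every link in the hypertext
    resolves, via resolver then accessor, to a component in the hypertext."""
    for link, specs in links.items():
        if link not in components:
            continue  # only links that belong to the hypertext are checked
        for spec in specs:
            uid = resolver.get(spec)  # the resolver is partial
            if uid is None or accessor.get(uid) not in components:
                return False
    return True


components = {"A", "B", "L"}
links = {"L": ["sa", "sb"]}
resolver = {"sa": "u1", "sb": "u2", "sl": "u3"}
accessor = {"u1": "A", "u2": "B", "u3": "L"}
```

Dropping the entry for "sb" from the resolver makes the hypertext link
inconsistent, since one end of link "L" can then no longer be resolved.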

Creating a new link component is then given by the following function. 

    │ createLinkComponent : HYPERTEXT × LINK × PRESENT_SPEC
    │                           ⇸ HYPERTEXT × COMPONENT
    ├─────────────────────────────────────────
    │ ∀ H : HYPERTEXT; l : LINK; ps : PRESENT_SPEC •
    │     ∃ H' : HYPERTEXT; c : COMPONENT |
    │         c = component(link(l), minInfo(ps)) ∧
    │         H' = createComponent(H, c) ∧
    │         createLinkComponent(H, l, ps) = (H', c) •
    │             linkConsistent H'

In creating a composite we must ensure that any subcomponents of the new 
composite are already in the hypertext. 

    │ createCompositeComponent :
    │     HYPERTEXT × seq BASE_COMPONENT × PRESENT_SPEC
    │         → HYPERTEXT × COMPONENT
    ├─────────────────────────────────────────
    │ ∀ H : HYPERTEXT; s : seq BASE_COMPONENT;
    │         ps : PRESENT_SPEC •
    │     ∃ newComp : COMPONENT |
    │         newComp = component(composite(s), minInfo(ps)) •
    │             createCompositeComponent(H, s, ps) =
    │                 (createComponent(H, newComp), newComp) ∧
    │             (∀ c : COMPONENT | base(c) ∈ ran s •
    │                 c ∈ H.components)

We package creating a new component with the following function. This
is the function which will ultimately be invoked from the run-time layer.

    │ CreateNewComponent : HYPERTEXT × BASE_COMPONENT × PRESENT_SPEC
    │                          ⇸ HYPERTEXT × COMPONENT
    ├─────────────────────────────────────────
    │ ∀ H : HYPERTEXT; bc : BASE_COMPONENT;
    │         ps : PRESENT_SPEC •
    │     ((∃ a : ATOM • bc = atom(a)) ⇒
    │         CreateNewComponent(H, bc, ps) =
    │             createAtomicComponent(H, atom⁻¹(bc), ps)) ∧
    │     ((∃ l : LINK • bc = link(l)) ⇒
    │         CreateNewComponent(H, bc, ps) =
    │             createLinkComponent(H, link⁻¹(bc), ps)) ∧
    │     ((∃ s : seq BASE_COMPONENT • bc = composite(s)) ⇒
    │         CreateNewComponent(H, bc, ps) =
    │             createCompositeComponent(H, composite⁻¹(bc), ps))

2.4 Deleting A Component 

In deleting a component we must ensure that we remove any links whose
specifiers resolve to that component.

    │ DeleteComponent : HYPERTEXT × UID ⇸ HYPERTEXT
    ├─────────────────────────────────────────
    │ DeleteComponent = (λ H : HYPERTEXT; uid : UID •
    │     (μ H' : HYPERTEXT | ∃ uids : 𝔽 UID |
    │         uids = {uid} ∪ linksTo(H, uid) •
    │             H'.components = H.components \ H.accessor(| uids |) ∧
    │             H'.accessor = uids ⩤ H.accessor ∧
    │             H'.resolver = H.resolver ⩥ uids))
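
The deletion operation can be sketched in Python as follows. The
representation is an assumption made for the example: components are
labeled by strings, the set of components is taken to be the range of the
accessor, and each link UID is mapped to the component specifications of its
ends. The names Hypertext, links_to, and delete_component are hypothetical.

```python
from typing import Dict, List, NamedTuple, Set


class Hypertext(NamedTuple):
    accessor: Dict[str, str]          # UID -> component label
    resolver: Dict[str, str]          # component spec -> UID
    link_specs: Dict[str, List[str]]  # link UID -> component specs of its ends


def links_to(h: Hypertext, uid: str) -> Set[str]:
    """UIDs of links having at least one specifier that resolves to uid."""
    return {lid for lid, specs in h.link_specs.items()
            if lid in h.accessor
            and any(h.resolver.get(s) == uid for s in specs)}


def delete_component(h: Hypertext, uid: str) -> Hypertext:
    """Remove uid together with every link that resolves to it."""
    uids = {uid} | links_to(h, uid)
    return Hypertext(
        # domain anti-restriction of the accessor by uids
        accessor={u: c for u, c in h.accessor.items() if u not in uids},
        # range anti-restriction of the resolver by uids
        resolver={s: u for s, u in h.resolver.items() if u not in uids},
        link_specs={l: s for l, s in h.link_specs.items() if l not in uids},
    )


h = Hypertext(
    accessor={"u1": "A", "u2": "B", "u3": "L"},
    resolver={"sa": "u1", "sb": "u2", "sl": "u3"},
    link_specs={"u3": ["sa", "sb"]},
)
h_after = delete_component(h, "u2")
```

Deleting component "u2" also removes link "u3", because one of its ends
resolved to the deleted component. Specifications that resolved to either
UID disappear from the resolver as well.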

2.5 Modifying Components 

In modifying a component we require that its associated information remain 
unchanged, that its type (atom, link, or composite) remain unchanged, and 
that the resulting hypertext remains link consistent. 

ModifyComponent : HYPERTEXT × UID × COMPONENT ⇸ HYPERTEXT

  ∀ H : HYPERTEXT; uid : UID; c' : COMPONENT •
    ∃ c : COMPONENT; H' : HYPERTEXT |
        c = H.accessor(uid) ∧
        H'.components = H.components \ {c} ∪ {c'} ∧
        H'.accessor = H.accessor ⊕ {uid ↦ c'} ∧
        H'.resolver = H.resolver ∧
        info(c') = info(c) ∧
        c typeConsistent c' ∧
        linkConsistent H' •
      ModifyComponent(H, uid, c') = H'
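
A minimal Python sketch of these preconditions follows. The Component record and its field names (kind, base, info) are assumptions made for illustration, and the link consistency check is deliberately left out because it depends on the full HYPERTEXT schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    kind: str     # "atom", "link" or "composite"
    base: object  # the component contents
    info: tuple   # hypothetical stand-in for COMP_INFO


def modify_component(accessor, uid, new):
    """Return a copy of accessor in which uid maps to new.

    Two requirements of ModifyComponent are checked: the component
    information and the component type must not change. Link
    consistency of the result is not checked in this sketch.
    """
    old = accessor[uid]
    if new.info != old.info:
        raise ValueError("component information must remain unchanged")
    if new.kind != old.kind:
        raise ValueError("component type must remain unchanged")
    updated = dict(accessor)  # functional override of the accessor
    updated[uid] = new
    return updated
```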

2.6 Retrieving A Component 

To retrieve a component, given its UID, means just to have the returned 
value of the accessor function. 

getComponent : HYPERTEXT × UID ⇸ COMPONENT

  ∀ H : HYPERTEXT; uid : UID •
    getComponent(H, uid) = H.accessor(uid)

Given a UID which happens to represent a link, there exist operations
which return either a source or destination specifier for that component.

2.7 Attributes 

We introduce functions to both get and set the value of a given attribute (if 
it exists) for a given component. 

AttributeValue : HYPERTEXT × UID × ATTRIBUTE ⇸ VALUE

  ∀ H : HYPERTEXT; uid : UID; a : ATTRIBUTE •
    (∃ c : COMPONENT | c = H.accessor(uid) •
      AttributeValue(H, uid, a) = attributes(c)(a))

SetAttributeValue : HYPERTEXT × UID × ATTRIBUTE × VALUE ⇸ HYPERTEXT

SetAttributeValue =
    (λ H : HYPERTEXT; uid : UID; a : ATTRIBUTE; v : VALUE •
      (μ H' : HYPERTEXT | ∃ c, c' : COMPONENT •
          c = H.accessor(uid) ∧
          c' = modifyAttribute(c, a, v) ∧
          H'.components = H.components \ {c} ∪ {c'} ∧
          H'.accessor = H.accessor ⊕ {uid ↦ c'} ∧
          H'.resolver = H.resolver))

There is also a function which returns the set of all component attributes. 

AllAttributes : HYPERTEXT → 𝔽 ATTRIBUTE

  ∀ H : HYPERTEXT •
    AllAttributes(H) = {a : ATTRIBUTE | ∃ c : COMPONENT •
        a ∈ dom(attributes(c))}
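
The three attribute functions can be illustrated with a small Python sketch. As a simplifying assumption, each component is reduced here to its attribute dictionary, and the accessor is a plain dict from UID to that dictionary; the function names are hypothetical renderings of AttributeValue, SetAttributeValue and AllAttributes. The sketch collects attributes only from components reachable through the accessor, which is one possible reading of the definition above.

```python
def attribute_value(accessor, uid, attr):
    """Value of attr for the component with the given UID (KeyError if absent)."""
    return accessor[uid][attr]


def set_attribute_value(accessor, uid, attr, value):
    """Return a new accessor in which the component's attr is set to value."""
    updated = dict(accessor)
    updated[uid] = {**accessor[uid], attr: value}  # modified copy of the component
    return updated


def all_attributes(accessor):
    """Set of every attribute name used by some component."""
    return {a for attrs in accessor.values() for a in attrs}
```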

2.8 Anchors 

It is sometimes useful to know the link components which are associated 
with a particular anchor. The function LinksToAnchor returns the set of 
link component uids associated with a particular anchor id for a particular 
component id. 

LinksToAnchor : HYPERTEXT × UID × ANCHOR_ID → 𝔽 UID

LinksToAnchor =
    (λ H : HYPERTEXT; u : UID; aid : ANCHOR_ID •
      {lid : UID | ∃ lids : 𝔽 UID |
          lids = linksTo(H, u) ∧ lid ∈ lids •
        aid ∈ (anchorSpecs ∘ H.accessor)(lid)})
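
As an illustration, LinksToAnchor can be approximated in Python as below. The Specifier record and the links dictionary are assumptions of this sketch: specifiers are taken to be already resolved to component UIDs, and the sketch requires the component UID and the anchor id to occur in the same specifier, which is slightly stricter than the set comprehension above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Specifier:
    component_uid: int
    anchor_id: int
    direction: str  # for example "FROM" or "TO"


def links_to_anchor(links, uid, anchor_id):
    """UIDs of the links having a specifier for (uid, anchor_id).

    links maps each link component UID to its list of specifiers.
    """
    return {lid for lid, specs in links.items()
            if any(s.component_uid == uid and s.anchor_id == anchor_id
                   for s in specs)}
```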
3 Simple Runtime Layer Model

3.1 An Overview of the Runtime Layer

The fundamental concept in the runtime layer is the instantiation of a
component. An instantiation is a presentation of the component to the user.
Operationally, an instantiation should be thought of as a kind of runtime
cache for the component. A 'copy' of the component is cached in the
instantiation, the user views and/or edits this instantiation, and the
altered cache is then 'written' back into the storage layer. Note that there can be
more than one simultaneous instantiation for any given component. Each 
instantiation is assigned a unique (within session, see below) instantiation 
identifier (IID). 

Instantiation of a component also results in instantiation of its anchors. 
An instantiated anchor is known as a link marker. This terminology is
congruent with that used in Intermedia, where the term "anchor" refers to an
attachment point or region and the term "link marker" refers to the visible
manifestation of that anchor in a displayed document. In order to
accommodate the link marker notion within the model, an instantiation is actually
a complex entity containing a base instantiation together with a sequence 
of link markers and a function mapping link markers to the anchors they 
instantiate. A base instantiation is a primitive in the model that represents 
some sort of presentation of the component to the user. 

At any given moment, the user of a hypertext can be viewing and/or edit- 
ing any number of component instantiations. The runtime layer includes an 
entity called a session which serves to keep track of the moment-by-moment 
mapping between components and their instantiations. Specifically, when a 
user wants to access a hypertext, he or she opens a session on that hyper- 
text. The user can then create instantiations of components in the hypertext 
(an action known as "presenting" the component). The user can edit these 
instantiations, can modify the component based on the accumulated edits 
to the instantiation (an action known as "realizing" the edits), and finally 
can destroy the instantiation (an action known as "unpresenting" a compo- 
nent). When the user is finished interacting with the hypertext, the session 
is closed. 
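
The session life cycle described above can be sketched in Python. Everything in this block is a simplifying assumption: the Session class and its method names are hypothetical, components are plain strings, the instantiator simply copies the component, and the realizer returns the edited copy unchanged.

```python
import itertools


class Session:
    """Hypothetical session acting as a runtime cache over a hypertext."""

    def __init__(self, accessor):
        self.accessor = dict(accessor)  # UID -> component (a string here)
        self.instants = {}              # IID -> (instantiation, UID)
        self.history = ["OPEN"]
        self._iids = itertools.count(1)  # IIDs are unique within the session

    def present(self, uid):
        """Cache a copy of the component and return its new IID."""
        iid = next(self._iids)
        self.instants[iid] = (self.accessor[uid], uid)
        self.history.append("PRESENT")
        return iid

    def edit(self, iid, new_instantiation):
        """Change the cached copy only; the stored component is untouched."""
        _, uid = self.instants[iid]
        self.instants[iid] = (new_instantiation, uid)
        self.history.append("EDIT")

    def realize_edits(self, iid):
        """Write the cached copy back into the storage layer."""
        instantiation, uid = self.instants[iid]
        self.accessor[uid] = instantiation
        self.history.append("SAVE")

    def unpresent(self, iid):
        """Destroy the instantiation."""
        del self.instants[iid]
        self.history.append("UNPRESENT")
```

Presenting the same component twice yields two independent instantiations, so realizing the edits made in one of them does not alter the other cached copy.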

In the model, the session entity contains the hypertext being accessed,
a mapping from the IIDs of the session's current instantiations to their
corresponding components in the hypertext, a history, a runtime resolver
function, an instantiator function, and a realizer function. At any given
moment, the history is a sequence of all operations carried out since the
last open session operation. In the present version of the model, this
history is used only in defining the notion of a read-only session. It is
intended to be available, however, to any operation that needs to be
conditionalized on preceding operations.

The session's runtime resolver function is the runtime version of the stor- 
age layer's resolver function. Like the resolver, it maps specifiers into com- 
ponent UIDs. The runtime resolver, however, can use information about 
the current session, including its history, in the resolution process. The 
storage layer resolver has no access to such runtime information. For
example, a specifier may refer to "the most recently accessed component
named 'xyzzy'". The runtime resolver is responsible for mapping this
specifier into the UID matching this specification. The storage layer
resolver would not be able to handle this specification. The runtime
resolver is restricted to be a
superset of the storage layer resolver function; any specifier that the storage 
layer resolver can resolve to a UID must be resolved to the same UID by the 
runtime resolver. 
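
The superset restriction can be illustrated with a short Python sketch. The function make_runtime_resolver, the string specifier "last-visited" and the visited list are all hypothetical; they stand for a runtime resolver that first defers to the storage layer resolver and only then consults session history.

```python
def make_runtime_resolver(storage_resolver, visited):
    """Build a runtime resolver that extends storage_resolver.

    storage_resolver: dict from component spec to UID (storage layer).
    visited: list of UIDs accessed so far in the session, oldest first.
    """
    def resolve(spec):
        # Anything the storage layer can resolve gets the same UID.
        if spec in storage_resolver:
            return storage_resolver[spec]
        # A specifier that only makes sense at run time.
        if spec == "last-visited" and visited:
            return visited[-1]
        raise KeyError(spec)

    return resolve
```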

At the heart of the runtime model is the session's instantiator function.
Input to the instantiator consists of a component (UID) and a presentation 
specification. The instantiator returns an instantiation of the component as 
part of the session. The presentation specification is primitive in the model, 
but is intended to contain information specifying how the component being 
instantiated is to be "presented" by the system during this instantiation. 
Note that the component itself has a presentation specification from the 
storage layer of the model. This presentation specification is meant to con- 
tain information about the component's own notion of how it should be 
presented. It is the responsibility of the instantiator function to adjudicate 
(by selection or combination or otherwise) among the presentation specifi- 
cation passed to the instantiator and the presentation specification attached 
to the component being instantiated. The model in its current form does 
not make this adjudication explicit. 

The instantiator function is the core of the present component
operation. Present component takes a component specifier (together with a
session and a presentation specification) and calls the instantiator using
the component UID derived from resolving the specifier. Present component
in turn is the core of the follow link operation. Follow link takes (the
IID of) an instantiation together with a link marker contained within that
instantiation. It then presents the component(s) that are at the
destination endpoints (i.e., endpoints whose specifier has a direction of
TO) of all link(s) that have as an endpoint the anchor represented by the
given link marker. In the case where all links are binary, this is
equivalent to following a link from the link marker for its source. The
result of following the link is a presentation of its destination
component and anchor.

The instantiator function also has an "inverse" function called the real- 
izer function which takes an instantiation and returns a (new) component 
that "reflects" the current state of the instantiation (i.e., including recent 
edits to the instantiation). This is the basic mechanism for "writing back 
the cache" after an instantiation has been edited. The component produced 
by the realizer is used as an argument to the storage layer modify
component operation to replace the component with the edited component. This
operation is wrapped in the function called realize edits in the runtime layer. 

3.2 Formalization of the Runtime Layer 

The runtime model depends on the notion of an instantiation which is the
visual representation of some component. Each instantiation has a unique 
instantiation id from the given set IID. 

[IID] 

An instantiation consists of a base instantiation which "represents" a com- 
ponent, a sequence of link markers which "represents" the anchors of the 
component, and a function mapping link markers to anchor ids. 

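[BASE_INSTANTIATION, LINK_MARKER]

┌─ INSTANTIATION ───────────────────────────────
│ base : BASE_INSTANTIATION
│ links : seq LINK_MARKER
│ linkAnchor : LINK_MARKER → ANCHOR_ID
├───────────────────────────────────────────────
│ dom linkAnchor = ran links
└───────────────────────────────────────────────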



A user manipulates instantiations, so that there must be a way of map- 
ping from instantiations to components. The function variable instants in 
the SESSION schema defined below maps an instantiation id to a pair con- 
sisting of an instantiation and the UID of its corresponding component. 
The accessor function in the HYPERTEXT schema then maps these UIDs 
to components. More than one instantiation may be associated with the
same UID and hence with the same component. 

A hypertext is manipulated in a session which is modeled by the SESSION
schema. The OPERATION free type names the various operations a user 
can perform during a hypertext session. 

OPERATION ::= OPEN | CLOSE
            | PRESENT | UNPRESENT
            | CREATE | EDIT | SAVE | DELETE

During a session, a user opens up one or more instantiations of hypertext 
components through which the hypertext may be modified. We use the term 
presents to denote opening up an instantiation on a component because the 
component is presented to the user by means of the instantiation. An
instantiation is a function not only of the component it represents and of
two presentation specifiers (one implicit, from the component's compInfo,
and the other explicit, either given by the user or taken from a link
specifier) but also, implicitly, of the "current" set of instantiations.
The function instantiator
which is part of the schema SESSION captures this relationship. In sav- 
ing the result of a series of edits, the reverse of the instantiator function is 
needed; we call this function a realizer function. It takes an instantiation 
and returns a component based on the current session. 

There are some component specifiers which can only be resolved at run- 
time. An example of such a specifier is "the last node visited." The storage 
layer should be independent of such component specifiers. We introduce 
the notion of a run-time resolver which is just an extension of the regular 
resolver function. Note that the invariants on anchors given in the schema 
for HYPERTEXT only apply to those component specifiers which are in 
the domain of H.resolver. Also the LinksToAnchor function will not give
those links with component specifiers resolvable only at run-time (not in
the domain of H .resolver) — these additional links must be captured in the 
run-time layer. 



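┌─ SESSION ─────────────────────────────────────────────
│ H : HYPERTEXT
│ history : seq OPERATION
│ instants : IID ⇸ (INSTANTIATION × UID)
│ instantiator : UID × PRESENT_SPEC → INSTANTIATION
│ realizer : INSTANTIATION → COMPONENT
│ runTimeResolver : COMPONENT_SPEC ⇸ UID
├───────────────────────────────────────────────────────
│ head(history) = OPEN
│ ∀ uid : UID; ps : PRESENT_SPEC |
│     uid ∈ dom H.accessor •
│   realizer(instantiator(uid, ps)) = H.accessor(uid) ∧
│   H.resolver ⊆ runTimeResolver
└───────────────────────────────────────────────────────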
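
The two invariants of the SESSION schema can be checked mechanically for small finite examples. The Python function below is a sketch under assumptions: the name session_invariant_holds is hypothetical, the accessor and the two resolvers are dictionaries, the instantiator and realizer are passed in as ordinary callables, and only a given finite list of presentation specifications is tried.

```python
def session_invariant_holds(accessor, resolver, runtime_resolver,
                            instantiator, realizer, present_specs):
    """Check the two SESSION invariants on finite test data.

    1. Realizing a fresh instantiation gives back the stored component.
    2. The runtime resolver agrees with the storage layer resolver
       wherever the latter is defined.
    """
    round_trip = all(
        realizer(instantiator(uid, ps)) == component
        for uid, component in accessor.items()
        for ps in present_specs
    )
    extends = all(
        spec in runtime_resolver and runtime_resolver[spec] == uid
        for spec, uid in resolver.items()
    )
    return round_trip and extends
```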



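┌─ ΔSESSION ────────────────────────────────────
│ SESSION
│ SESSION'
├───────────────────────────────────────────────
│ #history' = #history + 1
│ instantiator' = instantiator
│ realizer' = realizer
└───────────────────────────────────────────────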



A session begins with an existing hypertext (storage system) and a clean 
instantiation slate. 

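┌─ openSession ─────────────────────────────────
│ SESSION
│ hypertext? : HYPERTEXT
├───────────────────────────────────────────────
│ H = hypertext?
│ history = ⟨OPEN⟩
│ instants = ∅
└───────────────────────────────────────────────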



Because there are several operations which can open up a new
instantiation, we introduce the following function which opens up a set of
new instantiations on an existing set of components.

openComponents :
      SESSION × 𝔽 (SPECIFIER × PRESENT_SPEC) ⇸ SESSION

  ∀ S : SESSION; specs : 𝔽 (SPECIFIER × PRESENT_SPEC) •
    ∃ S' : SESSION; iids : 𝔽 IID;
      newInstants : IID ⇸ (INSTANTIATION × UID) |
        S'.H = S.H ∧
        S'.runTimeResolver = S.runTimeResolver ∧
        S'.history = S.history ⌢ ⟨PRESENT⟩ ∧
        S'.instants = S.instants ⊕ newInstants ∧
        #iids = #specs ∧ iids ∩ dom S.instants = ∅ ∧
        dom newInstants = iids ∧
        (∀ s : specs •
          ∃ iid : iids; uid : UID;
            cs : COMPONENT_SPEC;
            ps : PRESENT_SPEC;
            inst : INSTANTIATION |
              cs = (first(s)).componentSpec ∧
              ps = second(s) ∧
              uid = S.runTimeResolver(cs) ∧
              inst = S.instantiator(uid, ps) •
            newInstants(iid) = (inst, uid)) •
      openComponents(S, specs) = S'



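┌─ presentComponent ────────────────────────────────────
│ ΔSESSION
│ spec? : SPECIFIER
│ presentSpec? : PRESENT_SPEC
├───────────────────────────────────────────────────────
│ θSESSION' =
│     openComponents(θSESSION, {(spec?, presentSpec?)})
└───────────────────────────────────────────────────────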



We can also follow a link from a given link marker in a given instantiation 
and present all the components for which the associated link(s) has(have) 
specifiers with a "TO" direction. There may be more than one link involved
because there may be more than one link associated with a particular anchor.

┌─ followLink ──────────────────────────────────────────────────────
│ ΔSESSION
│ iid? : IID
│ linkMarker? : LINK_MARKER
├───────────────────────────────────────────────────────────────────
│ ∃ aid : ANCHOR_ID; links : 𝔽 LinkComp;
│   specs : 𝔽 (SPECIFIER × PRESENT_SPEC) |
│     aid = (first(instants(iid?))).linkAnchor(linkMarker?) ∧
│     links = H.accessor(| LinksToAnchor(H,
│         second(instants(iid?)), aid) |) ∧
│     first(| specs |) = {s : SPECIFIER | ∃ linkc : LinkComp |
│         linkc ∈ links • s ∈ ran (link~(base(linkc))).specifiers} ∧
│     (∀ s : specs • (first(s)).direction = TO ∧
│         second(s) = (first(s)).presentSpec) •
│   θSESSION' =
│       openComponents(θSESSION, specs)
└───────────────────────────────────────────────────────────────────
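
The selection of destinations performed by followLink can be sketched in Python. This block is an illustration under assumptions: the function name follow_link is hypothetical, each specifier is reduced to a (component UID, anchor id, direction) triple that is already resolved, and presentation specifications are ignored.

```python
def follow_link(links, uid, anchor_id):
    """Return the (component UID, anchor id) endpoints to be presented.

    links maps each link UID to a list of
    (component_uid, anchor_id, direction) triples. Every link that has
    the given anchor as an endpoint contributes its TO endpoints.
    """
    destinations = []
    for specs in links.values():
        endpoints = {(c, a) for c, a, _ in specs}
        if (uid, anchor_id) in endpoints:
            destinations += [(c, a) for c, a, d in specs if d == "TO"]
    return destinations
```

For a binary link with one FROM and one TO specifier, calling the function with the source anchor yields exactly the destination endpoint.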



Opening up a new instantiation on a newly created component is mod- 
eled by the newComponent schema. 

┌─ newComponent ────────────────────────────────────────────────
│ ΔSESSION
│ component : COMPONENT
│ baseComp? : BASE_COMPONENT
│ ps? : PRESENT_SPEC
│ presentSpec? : PRESENT_SPEC
├───────────────────────────────────────────────────────────────
│ history' = history ⌢ ⟨CREATE⟩
│ (H', component) = CreateNewComponent(H, baseComp?, ps?)
│ ∃ uid : UID; inst : INSTANTIATION; iid : IID |
│     iid ∉ dom instants •
│   inst = instantiator(uid, presentSpec?) ∧
│   uid = H'.accessor~(component) ∧
│   instants' = instants ⊕ {iid ↦ (inst, uid)}
└───────────────────────────────────────────────────────────────



The schema unPresent models the removal of an instantiation. 

┌─ unPresent ───────────────────────────────────
│ ΔSESSION
│ iid? : IID
├───────────────────────────────────────────────
│ H' = H
│ history' = history ⌢ ⟨UNPRESENT⟩
│ instants' = {iid?} ⩤ instants
└───────────────────────────────────────────────



Instantiations can be modified by editing them. Editing an instantiation 
does not cause a change in its corresponding component. An explicit save 
operation is required to save the result of an edit (or many edits). 

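┌─ editInstantiation ───────────────────────────────────────────
│ ΔSESSION
│ instantiation? : INSTANTIATION
│ iid? : IID
├───────────────────────────────────────────────────────────────
│ H' = H
│ history' = history ⌢ ⟨EDIT⟩
│ iid? ∈ dom instants
│ instants' = instants ⊕
│     {iid? ↦ (instantiation?, second(instants(iid?)))}
└───────────────────────────────────────────────────────────────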



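┌─ realizeEdits ────────────────────────────────────────────────
│ ΔSESSION
│ iid? : IID
├───────────────────────────────────────────────────────────────
│ history' = history ⌢ ⟨SAVE⟩
│ instants' = instants
│ ∃ c : COMPONENT; inst : INSTANTIATION; uid : UID |
│     inst = first(instants(iid?)) ∧
│     uid = second(instants(iid?)) ∧
│     c = realizer(inst) •
│   H' = ModifyComponent(H, uid, c)
└───────────────────────────────────────────────────────────────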



To be complete we must allow a component to be deleted. Since a 
component is identified by its instantiation, the component to be deleted 
must have been instantiated. We also must remove any other instantiations 
for that component. 

┌─ deleteComponent ─────────────────────────────────────
│ ΔSESSION
│ iid? : IID
├───────────────────────────────────────────────────────
│ history' = history ⌢ ⟨DELETE⟩
│ iid? ∈ dom instants
│ ∃ uid : UID | uid = second(instants(iid?)) •
│   H' = DeleteComponent(H, uid) ∧
│   instants' = {iid?} ⩤ instants
└───────────────────────────────────────────────────────



A session finally ends when it is closed out. Notice that the default is not
to save the results of any changes to instantiations. 

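┌─ closeSession ────────────────────────────────
│ ΔSESSION
├───────────────────────────────────────────────
│ H' = H
│ history' = history ⌢ ⟨CLOSE⟩
│ instants' = ∅
└───────────────────────────────────────────────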



We can model a read-only SESSION with the following schema: 

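┌─ READ_ONLY_SESSION ───────────────────────────
│ SESSION
├───────────────────────────────────────────────
│ {SAVE, CREATE, DELETE} ∩ ran history = ∅
└───────────────────────────────────────────────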
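
The read-only condition is a simple test on the history. In the Python sketch below, the history is assumed to be a list of operation names given as strings, and the names WRITE_OPERATIONS and is_read_only are hypothetical.

```python
# Operations that change the stored hypertext.
WRITE_OPERATIONS = {"SAVE", "CREATE", "DELETE"}


def is_read_only(history):
    """True when no write operation occurs anywhere in the history."""
    return WRITE_OPERATIONS.isdisjoint(history)
```

Note that EDIT does not violate the condition, since editing an instantiation changes only the runtime cache and not the stored component.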



4 Conformance with the Reference Model 

One reason to have a reference model for hypertext is to ascertain whether
a purported hypertext system actually warrants being called a hypertext
system. So, given an actual hypertext system, how do we show that it meets,
or is conformant with, the model? The best guidance for answering this
question comes from the VDM experience under the heading of data
reification as described, for example, in Chapter 8 of Cliff Jones' book
[13] on software development using VDM. First, we must exhibit total
functions, called retrieve functions, which map the actual types and
functions from a given (actual) hypertext system to each of the following
types and functions of the model. We must also demonstrate adequacy - that
there is at least one actual representation for each abstract value.
Obviously, the
retrieve functions must satisfy the invariants which are given for the data 
types and functions. An informal way of saying this is that everything which 
is expressible or realizable in the model must be expressible or realizable in 
the actual system. 

In actuality our model is much more powerful than necessary. In
particular:

• By admitting multi-way links and links to links in the model, we put 
a fairly heavy burden on any implementation. 

• Many hypertext systems do not have the notion of composites. 

• Some hypertext systems, such as KMS, do not have links with
both an explicit source and destination. Thus requiring discrimination
amongst all the values of type DIRECTION is too much.

We are currently working on a "minimal" model which addresses the above
items and others as may be necessary. 

The following list summarizes the given sets (base types), abstract types, 
functions, and operations which must have actual realizations in a hypertext 
system conforming to the model. 

1. Given sets.

UID
COMPONENT_SPEC
PRESENT_SPEC
ANCHOR_ID
ANCHOR_VALUE
ATOM
ATTRIBUTE
VALUE
IID
BASE_INSTANTIATION
LINK_MARKER

2. Abstract types.

DIRECTION
ANCHOR
SPECIFIER
LINK
COMP_INFO
BASE_COMPONENT
COMPONENT
HYPERTEXT
INSTANTIATION
OPERATION
SESSION

3. Storage layer functions.

CreateNewComponent
DeleteComponent
ModifyComponent
AttributeValue
SetAttributeValue
AllAttributes
LinksToAnchor

4. Runtime layer operations (schemas).

openSession
presentComponent
followLink
newComponent
unPresent
editInstantiation
realizeEdits
deleteComponent
closeSession

5 Concluding Remarks

Development of the Dexter model is still in its very early stages. As
discussed in Section 4, the model as currently stated is far more powerful
than any existing hypertext system. The provisions for n-ary links and for
composite nodes, for example, are intended to accommodate the design of
future hypertext systems. No existing system that we have examined includes
both n-ary links and composite nodes. The result is that no existing system
'conforms to' the model in the sense that it supports all of the mechanisms
that the model supports. The solution to this problem is to make some
mechanisms 'optional', resulting in a family of interrelated models that
support differing sets of optional mechanisms. The weakest model, for
example, would have no composites and only binary links. The strongest
model would be the Dexter model in the present form. Conformance to the
model could then be conditionalized on the exact set of mechanisms
supported. Systems would be compared on the basis of the set of mechanisms
that they do support.

A related issue involves a number of consistency restrictions that the
present model imposes. For example, when creating a link the model requires
that all of its specifiers resolve to existing components. This
restriction prevents the creation of links that are 'dangling' from the
outset. The model does not, however, include any restrictions that prevent
the creation of dangling links via the deletion of linked-to components.
This restriction adequately represents the consistency guarantee of KMS.
But it is overly restrictive for Augment, which allows creation of
initially dangling links. In contrast, it is not restrictive enough for
NoteCards and HAM, which prevent dangling links at all times. As in the
case of mechanisms, restrictions of this sort will have to be made optional
in the model. Conformance to the model can then be conditionalized on
appropriate choices of restrictions. As in the case for mechanisms, systems
can be compared on the basis of the set of restrictions that they enforce.

The model has yet to be compared in detail to the hypertext systems 
it is designed to represent. Clearly, a necessary step in the development 
of the model is to formally specify (in Z) the architecture and operation 
of a number of 'reference' hypertext systems using the constructs from the 
Dexter model. These reference systems should be chosen to represent a 
broad spectrum of designs, intended application domains, implementation 
platforms, etc. This enterprise would provide valuable feedback regarding 
the adequacy and completeness of the model. In particular, it will help
assess whether the model provides sufficient mechanisms for representing
the important (common) abstractions found in the reference systems. It will
also provide feedback on the 'naturalness' of the model, i.e., on whether
the specification of the reference systems in Dexter terms feels 'natural'
or whether the abstractions found in certain systems must be excessively
massaged to fit into the Dexter abstractions.

    <hypertext>
      <component>
        <type> text </type>
        <uid> 21 </uid>
        <data> This is some text </data>
        <anchor>
          <id> 1 </id>
          <location> 13 </location>
        </anchor>
      </component>
      <component>
        <type> text </type>
        <uid> 777 </uid>
        <data> This is some other text </data>
        <anchor>
          <id> 1 </id>
          <location> 13-19 </location>
        </anchor>
      </component>
      <component>
        <type> link </type>
        <uid> 661 </uid>
        <specifier>
          <component_uid> 21 </component_uid>
          <anchor_id> 1 </anchor_id>
          <direction> FROM </direction>
        </specifier>
        <specifier>
          <component_uid> 777 </component_uid>
          <anchor_id> 1 </anchor_id>
          <direction> TO </direction>
        </specifier>
      </component>
    </hypertext>

Figure 5: Example of a trivial interchange format derived from the model.

Despite its early stages of development, the model has already been 
useful in developing hypertext interchange standards. As described in the 
panel on interchanging hypertexts at the Hypertext 89 Conference [16], a 
number of efforts have been started to operationalize the abstractions of 
the Dexter model in the form of interchange formats. Figure 5 shows an 
example of one such format. This format was used for experimenting with
the interchange of hypertexts between NoteCards and HyperCard. As can
be seen from the figure, the format is a fairly straightforward rendering of
the entities found in the Dexter model into an SGMLish syntax. This format
is by no means a well-developed interchange standard. But it does suggest
that the Dexter model provides a good basis from which to develop such
standards. In fact, because the model is an attempt to provide a well-defined 
and comprehensive model, it is an ideal basis for developing a comprehensive 
standard for interchanging hypertexts between widely differing systems. 
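
To give a feel for how such a format might be consumed, the Python sketch below reads a fragment in the style of Figure 5. It rests on several assumptions: the fragment is treated as well-formed XML and parsed with the standard library module xml.etree.ElementTree, the element names component_uid and anchor_id are written with underscores, and the function load_components is a hypothetical helper, not part of any of the interchange efforts mentioned above.

```python
import xml.etree.ElementTree as ET

# A shortened fragment in the style of Figure 5 (assumed well-formed).
SAMPLE = """
<hypertext>
  <component>
    <type> text </type> <uid> 21 </uid>
    <data> This is some text </data>
    <anchor> <id> 1 </id> <location> 13 </location> </anchor>
  </component>
  <component>
    <type> link </type> <uid> 661 </uid>
    <specifier>
      <component_uid> 21 </component_uid>
      <anchor_id> 1 </anchor_id>
      <direction> FROM </direction>
    </specifier>
  </component>
</hypertext>
"""


def load_components(text):
    """Map each component UID to (type, list of specifier triples)."""
    result = {}
    for comp in ET.fromstring(text.strip()).findall("component"):
        uid = int(comp.findtext("uid"))
        specs = [(int(s.findtext("component_uid")),
                  int(s.findtext("anchor_id")),
                  s.findtext("direction").strip())
                 for s in comp.findall("specifier")]
        result[uid] = (comp.findtext("type").strip(), specs)
    return result
```

Text components simply carry an empty specifier list in this representation, while link components carry one triple per specifier.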
Abstract

In this paper we present multiple views on the issue of standardization
of Hypermedia systems that operate over a global heterogeneous information
network. To aid our analysis we introduce a reference model that captures
the information flow and the information control aspects from the viewpoint
of the user. This model is then used to focus the analysis of Hypermedia
systems from a variety of perspectives, such as overall resources, network
communication, interface building, and application writing. Based on our
analysis we conclude that at this time, the components of Hypermedia
systems that are ready for standardization are not necessarily
Hypermedia-specific. Moreover, we strongly believe that the
Hypermedia-specific aspects of these systems are not yet ready for
standardization and we question the wisdom of ever standardizing certain
Hypermedia-specific components such as the user interface or the navigation
tools. In addition, we conjecture that it may be desirable to standardize a
generic set of tools that can be used to build these components so as to
guarantee that the access to the information stored in future Hypermedia
systems will not be impaired.

ANGLES ON STANDARDIZATION



Intrinsic to the quest for standardization is the desire to make artifacts designed by different peo- 
ple in different places at different times compatible in relation to some predefined tasks. If we 
ask why one should attempt to standardize HyperText and Hypermedia technologies, we should 
look for the answer in efforts to combine pieces of information, text, graphics, still images, 
audio, video, animation and the like, which were created by different people in different places 
at different times. From this perspective it follows that it is reasonable to consider such standard- 
ization efforts only if we are willing to view the system as operating on a very large heterogene- 
ous network. 

Multimedia is a very complex artifact. It requires large amounts of resources and human 
involvement. Because of its potential as a new medium in which the human can seek, express 
and control knowledge, human interface considerations are of crucial importance. Much of the 
complexity involved in running the support hardware and software that make Hypermedia sys- 
tems a reality must remain hidden from the human and should proceed automatically. This 
implies the smooth and efficient transfer of information and control between many machines,
each with its own capabilities for communication and information handling. Furthermore, it 
implies that the overall speed of the composite system should remain mostly unaffected by the 
global configuration of the various information sources and conduits to enable synchronization. 

The standardization of an artifact as complex as Hypermedia involves the standardization, or at 
least a thorough understanding, of the evolutionary trends existing today in the Hypermedia sup- 
porting technologies. Any attempts to freeze a version of a rapidly evolving system should be 
carefully engineered so as to guarantee uninterrupted progress. Therefore, one of the more 
important challenges is to decide which aspects of Hypermedia need to become a standard and 
which aspects are better off left alone. This decision should be based on a model of the func- 
tionality of the system, a model flexible enough to allow unexpected technological develop- 
ments. To illustrate this point let us consider two extreme scenarios for Hypermedia functional- 
ity. In the first scenario a single user is running a standalone application on a workstation. In 
the second scenario a user is running a shared application, which includes real-time
communication via broadband networks with other users and with a variety of information
gateways to distributed data sources. Undoubtedly, the complexity of the issue of
standardization and its implications on information sharing are of different proportions
in the two scenarios. In the first case, standardization must guarantee the compatibility
of applications in many present and future
environments. In the second case, standardization will guarantee complete information sharing 
across authors, users and machines. It is the second scenario which can benefit the most from 
standardization and at the same time is in the most fragile developmental phase and hence 
requires special handling. 

There are at least three reasons to embark on standards efforts. First it may be valuable to come 
to some agreement on a Hypermedia independent environment which will support this brand of 
computation. Second, standards may focus on the representation of data objects used in
Hypermedia applications. And third, a standards effort might concentrate its energy on
providing a standard human interface for applications that are browsing and information
retrieval intensive.

With respect to the first point, a standard reference model which supports Hypermedia almost 
certainly shares many, if not all, its attributes with reference models for most other applications.
It may be useful, however, for Hypermedia practitioners to determine where, in a layered
reference model, Hypermedia applications exert most of their impact. Later in this paper we outline a
general reference model to facilitate discussion of this sort. 

Hypermedia applications are intimately concerned with data objects of various types and their 
interrelation. Because of their complex linking structure and multiple media flavor, Hypermedia
applications, in all likelihood, require that data objects have detailed and explicit representations.
Rich and flexible standard representations will be of great value to Hypermedia implementers in
matters of exchange and authoring. It is also the case, however, that these very same objects
(e.g. image, video) and their underlying representations are also critical to many other classes of
applications where exchange is important but has nothing to do with Hypermedia. Therefore, we 
question the prudence of Hypermedia-based object presentation standards. It would seem that 
Hypermedia practitioners should, again, consider the unique impact that hypertext applications 
might have on current and emerging object presentation standards efforts. We offer some con- 
jectures in this regard in the context of a reference model. 

While the defining characteristic of Hypermedia is its linking structure, its most often cited 
benefit is as an aid to human intellect. It may be reasonable, then, for Hypermedia practitioners to
look for standards in the human interface to realize this cognitive benefit. We conjecture that this
route is at best premature and at worst naive. A standard Hypermedia human interface is premature
simply because there does not exist very much solid information about the sorts of Hypermedia
design features that people find helpful. This state of affairs makes it virtually impossible
to code high level standards which could sensibly and practically apply to the multiplicity of
potential Hypermedia applications. Readiness aside, such a standards quest may not be prudent.
The target domain of an application often changes fundamental qualities of its interface. Given
the complexity of Hypermedia application domains, it may be more prudent to build highly
stereotyped applications optimized for the communication and problem solving needs of a particular
domain rather than a vanilla consistent interface that does not accommodate the rich variation
in Hypermedia applications.

In this paper, we center our discussion around a view of the Hypermedia system from the user's 
perspective. If we follow the information and control as they flow from the user's terminal to the
actual database, we cross at least eight functional levels. These levels are described in the next
section, followed by an illustration of their descriptive power in two examples of prototype
Multimedia systems. This illustration is followed by a discussion of the Hypermedia system from
other perspectives and the implications of this decomposition into levels on standardization.



A REFERENCE MODEL FROM THE USER'S VIEWPOINT 

Like many other dynamic systems with a high degree of complexity, Hypermedia can be viewed
from multiple perspectives. Each perspective reveals a dimension along which hierarchical 
description levels can be stacked and interdependencies between structure and function revealed. 
Level 6     File System
Level 5     Virtual File System
Level 4     Presentation Objects
========================================
Level 3     Dialogue/Applications
Level 2     Virtual Terminal
Level 1     Actual Terminal

Plus two:   Virtual Interprocess Communication Mechanism
            Broadband Network

Figure 1

Six Plus Two Level Reference Model Describing the Passage of Information and Control
From the User at the Actual Terminal to the Actual Information Source. We View
Levels 3 and 4 as the Only Hypermedia Specific Levels.



Imagine the way a Hypermedia system looks from the perspective of the user. From this
perspective, both information and control are conveyed through layers of interpretation until they
reach their destination which, in this case, is an arbitrary collection of actual file systems created
by arbitrary authors and located at remote sites which may be unknown to the user. We chose to
separate the path of information and control into eight independent layers, each with its own set
of primitive operations and data elements. Consequently, implicit in the construction of this
reference model is the assumption that the functionality of the overall system is decomposable.
However, keep in mind that many complex artifacts are only nearly decomposable; namely, their
actual implementation involves "mixing" of levels due to strong pragmatic considerations.
Therefore, we consider this model an idealization which serves as a general guideline during
system design and evaluation.
In Figure 1 we introduce the eight level model and represent it as a "six plus two" level model.
This is because the virtual interprocess communication mechanism and its actual network
implementation can be involved in the information transmission process anywhere along the
path between the actual terminal and the actual file system and hence cannot be placed in any
particular location on the stack.

Undoubtedly, the reference model, at the level of detail shown in Figure 1, may describe any
interactive distributed computer system. This raises the question of where we perceive the
Hypermedia specific components of the system to reside. In attempting to answer this question
one may realize that any computer system, when examined very closely, exhibits many of what
one may consider at least Hypertext specific characteristics. For example, the Unix® file system
provides much of the functionality of a Hypertext system, without, perhaps, a stylized user
interface. We will return to this point shortly, after we briefly review the levels shown in Figure 1.

The bottom two levels in Figure 1 describe the terminal and the virtual terminal. Like all virtual
devices, the virtual terminal provides a level of description that is implementation independent.
The primitive operations comprising the virtual device description are implemented in every
device to the best of that device's actual capabilities. Like all virtual devices, it represents an
additional level of processing of information, which is the price one must pay for flexibility.
With the virtual terminal level of description, dialogues (applications) can be constructed (level
3) that are implementable on the virtual terminal and which have as primitive operations user
interaction activities. The dialogue level is the "information browsing" level and the value of
separating it from the virtual terminal level is that it enables the application writer to tailor the
interface to the applications and to the targeted user community in a terminal independent
fashion. The level of description of the Presentation Objects (level 4) contains packets of
information stored in a form that can be displayed by any interface. The database containing these
objects is represented in level 5. Notice that operations at each level in the stack except the top
three are represented in terms of primitive operations of the level below it. In the case of the top
three levels, which are separated in Figure 1 by a double line, the order is reversed. This is
because the presentation objects are implemented in terms of the virtual file system, and the
virtual file system is implemented in terms of the actual file system. This reversal property is an
essential part of any description scheme that, similar to our scheme, follows the path of
information and control between the user and some real data - the scheme has to start with a real
object, namely the terminal, and end with a concrete implementation of data. We will not elaborate
on the actual implementation levels of the file system.

Which of the above levels are part of the Hypermedia application and which levels describe the
environment? In our work we view the Presentation Objects and the interface (levels 3 and 4) as
part of Hypermedia and they will be discussed in more detail in the next section. We view the
other description levels as representing the supporting infrastructure for global Hypermedia
systems and for most other applications. Currently, this supporting infrastructure is not
standardized, e.g., the virtual terminal and the virtual file system are not standards, and broadband
communication networks are far from standardized. Given this view, one may question, as we did in
the first section, the wisdom of standardizing Presentation Objects and aspects of interfaces
before, at least, stable sketches of a standard virtual terminal and a standard virtual file system are
agreed upon.
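
The split between Hypermedia specific levels and supporting infrastructure can be written down compactly. The following Python sketch is only an illustration of the classification described above; the names `Level`, `FLOATING_LEVELS`, `HYPERMEDIA_LEVELS`, and the two helper functions are hypothetical and do not come from the reference model itself.

```python
from enum import IntEnum


class Level(IntEnum):
    """The six stacked levels of Figure 1, numbered from the user's terminal."""
    ACTUAL_TERMINAL = 1
    VIRTUAL_TERMINAL = 2
    DIALOGUE_APPLICATIONS = 3
    PRESENTATION_OBJECTS = 4
    VIRTUAL_FILE_SYSTEM = 5
    ACTUAL_FILE_SYSTEM = 6


# The "plus two" levels have no fixed position on the stack, so they are
# kept outside the numbered enumeration.
FLOATING_LEVELS = (
    "Virtual Interprocess Communication Mechanism",
    "Actual (Broadband) Network",
)

# Levels regarded in the text as Hypermedia specific.
HYPERMEDIA_LEVELS = frozenset(
    {Level.DIALOGUE_APPLICATIONS, Level.PRESENTATION_OBJECTS}
)


def is_hypermedia_specific(level: Level) -> bool:
    """True for the levels the application writer is responsible for."""
    return level in HYPERMEDIA_LEVELS


def infrastructure_levels() -> list[Level]:
    """Stacked levels that belong to the supporting infrastructure."""
    return [lv for lv in Level if lv not in HYPERMEDIA_LEVELS]
```

Under this reading, the infrastructure consists of levels 1, 2, 5 and 6 together with the two floating levels, which is the set the text argues should be stabilized first.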
In the next section we will examine standardization issues from various viewpoints, but before
doing so we illustrate the value of the reference model presented in Figure 1 in two examples.
To demonstrate how the reference model provides structure to the functionality of Hypermedia
systems, we look at the following two systems from the domain of Customized Electronic
Information Delivery. Customized Electronic Information Delivery systems provide users with
variable information streams. Regarding the level of editing of the information items delivered by
such systems we can imagine two extremes - highly stylized, long, magazine-like articles, and
short raw articles directly from the news wires. The Electronic Magazine (Judd and Cruz, 1989)
is an example of the former, and the Passive Information Grazing system (Bussey et al, 1989) is
an example of the latter.



The Electronic Magazine research prototype displays multimedia articles through a stylized
interface providing the user with navigation and orientation tools. In addition, the magazine
contains multimedia authoring tools and a mark-up language. Figure 2 presents a glance at the
Electronic Magazine from the perspective of the reference model presented above.



Actual Terminal                  Sun-3 Color Monitor

Virtual Terminal                 SunView Window System

Dialogue/Applications            Multimedia Interface
                                 Navigation tools

Presentation Objects             Stylized Multimedia Articles
                                 SGML Based Mark-Up Language
                                 Authoring tools

Virtual File System              Linked Database of Multimedia Articles

Actual File System               Unix® Files

Virtual InterProcess
Communication Mechanism          None

Actual Network                   None



Figure 2

Description of the Electronic Magazine Prototype



SunView is a trademark of Sun Microsystems, Inc.
Unix is a registered trademark of AT&T.
[Figure 3 entries are only partly recoverable: "None"; "EXPANSE (see Bussey et al 1989)".]



Figure 3 

Description of the Passive Information Grazing Prototype 



The research prototype of the Passive Information Grazing System provides the user with a
continuous stream of multimedia information through a simple interface. Before reaching the user
the information passes through a filter eliminating articles that, according to a personalized user
profile, are of no interest to the user. Figure 3 shows a brief overview of the system from the
perspective of the reference model.
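
The filtering step can be pictured with a small sketch. The paper does not say how the personalized profile is represented, so everything below is an assumption made for illustration: the profile is taken to be a set of interesting topics, and `Article`, `UserProfile` and `graze` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator


@dataclass
class Article:
    title: str
    topics: frozenset[str]


@dataclass
class UserProfile:
    # Assumed representation: the topics this user cares about.
    interests: set[str] = field(default_factory=set)


def graze(stream: Iterable[Article], profile: UserProfile) -> Iterator[Article]:
    """Pass on only the articles that share at least one topic with the profile."""
    for article in stream:
        if article.topics & profile.interests:
            yield article
```

Because `graze` is a generator, it can sit on a continuous stream and hand articles to the interface one at a time, which matches the "passive" character of the prototype as described.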



INTERSECTING DIMENSIONS AND STANDARDIZATION ISSUES. 

Hypermedia systems require a very rich infrastructure. Even though they may be viewed as
mere application programs, they put a severe strain on existing computational and communication
resources. They push today's technologies to their limits. Therefore, when it comes to
standardization it may be ill advised to consider Hypermedia as a standalone application and not as a
system that is closely coupled with the development of its infrastructure. For example, from the
viewpoint of resources, the actual performance and capabilities of the system are affected by
resources available at each of the levels described in Figure 1. Parameters such as network
reliability and speed, information storage capacity, CPU "horse power", and terminal capabilities
may play a major role in defining the future shape of Hypermedia applications.

X Window System is a trademark of MIT.



Keeping the Hypermedia dependencies on its infrastructure in mind, we will proceed to discuss
Hypermedia and its standardization from the viewpoint of the Hypermedia application writer.
According to the reference model presented in Figure 1, the application writer is equipped with
terminal independent and file system independent authoring tools. In our framework, the
application writer is responsible for producing the Presentation Objects and the User Interface. The
Presentation Objects are the key elements of the system. A collection of them resides in the
virtual file system, and they are displayed on the interface. Aspects of their structure are given in
Figure 4.



Object Description: 
links 
attributes 
authorization 
displaying methods 

Object Presentation: 
envelope 
body 




Figure 4 

The Structure of Presentation Objects. 



It is important to note that in the context of the current discussion the Presentation Objects
provide a way to carve up meaningful presentable pieces of multimedia information. This is due to
the fact that the Presentation Objects contain sufficient specification to guarantee that they can
be displayed, classified, stored, retrieved, and filtered in a global Hypermedia system. Also,
they essentially represent an "Object Oriented Approach" to Hypermedia information
representation and management.

We view Presentation Objects as consisting of two main parts - the Description part and the
Presentation part. The Description part contains the links that the object has to other objects,
attributes of the object such as its size and the resources it needs, information about authorization
and authoring tools, and methods to display it. The Presentation part of the object contains the
envelope and the body. The envelope contains preview information about the body of the
object, e.g. title, abstract, video clip etc. The body is (a pointer to) the content of the object.
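
The two-part structure of a Presentation Object can be sketched as a small set of record types. This is an illustrative reading of Figure 4, not a proposed standard: the field types, the `fetch` callback, and the names `Envelope`, `Description`, and `PresentationObject` are assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Envelope:
    """Preview information about the body, e.g. title and abstract."""
    title: str
    abstract: str = ""


@dataclass
class Description:
    links: list[str] = field(default_factory=list)             # ids of linked objects
    attributes: dict[str, object] = field(default_factory=dict)  # e.g. size, resources
    authorization: set[str] = field(default_factory=set)       # assumed: permitted users
    display_methods: dict[str, Callable[[str], str]] = field(default_factory=dict)


@dataclass
class PresentationObject:
    description: Description
    envelope: Envelope
    body_ref: str  # a pointer to the content, not the content itself

    def preview(self) -> str:
        """What an interface can show without fetching the body."""
        return f"{self.envelope.title}: {self.envelope.abstract}"

    def display(self, method: str, fetch: Callable[[str], str]) -> str:
        """Fetch the body through `fetch` and render it with a named method."""
        render = self.description.display_methods[method]
        return render(fetch(self.body_ref))
```

Keeping the body behind a pointer means that previewing, classifying, and filtering can work on the description and envelope alone, while only `display` touches the (virtual) file system.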

The level of the dialogue captures user interface and session management issues. Some of its
functionality is given in Figure 5.



Current Status 
Available Objects 
Open Objects 

Navigation Tools: 

within object navigation 
between objects navigation 

Authoring Tools 
Displaying Tools 



Figure 5 

The Level of the Dialogue (Applications) 



CONCLUSIONS 

We are now in a position to consider our central problem here: What do we need to standardize 
in order to guarantee information sharing in Hypermedia systems that operate over a global 
heterogeneous information network? 

The standardization of the virtual terminal, the virtual file system, and the virtual interprocess
communication mechanism should come first. These standards will guarantee that any applica- 
tion can run on the standard virtual terminal irrespective of the terminal and the actual file sys- 
tem used, and that any network can be used for communication given that it can emulate the vir- 
tual network. Regarding the Hypermedia components, the Presentation Objects should be the 
next in line for standardization. However, as stated in the opening section, since at the present 
time we still cannot assess the potential multimedia capabilities of the future we must wait for 
the above standards before we consider freezing the form of the Presentation Objects and their 
database. 

If we now look at the situation where all the levels in Figure 1 are standardized except the
application level, we immediately realize that there is no point in standardizing it. The fact that
the levels above and below it are standardized imposes a strong enough constraint to produce a
standard set of tools for building the software at that level. This approach sets the functionality of
Hypermedia but not its "look and feel". We believe that at this point it is still inappropriate to
standardize the "look and feel" of Hypermedia because not enough is known about the relationship
between the users' cognitive skills and personal preferences and the benefits that Hypermedia
has to offer to them. Therefore, at this point, a standard user interface may defeat the purpose of
user-friendliness and may make personalized access to information impossible.
Abstract

In this paper a formal specification of an abstract model of hypertext is presented.
The Vienna Development Method (VDM) is used in this specification. Experiences
with a prototype hypertext system and studies of other existing hypertext systems are
captured in this formal specification. Basically a datamodel of hypertext is suggested. In
this model three main abstract data types of hypertext are formally defined: nodes,
networks and structures. The abstract data types are applied to the concepts of
object-oriented databases and a "hyperbase" is defined.



1 Introduction 

Hypertext is becoming a well-known technique for information representation and management.
Different research projects show that hypertext has many potential applications that are just beginning to
be explored: textbooks, dictionaries, encyclopedias and software engineering [Hypertext 1989]. At the
Hypertext'89 Conference a wide range of hypertext products were presented. They all covered many
different aspects of hypertext. But they had one thing in common: when it comes to means of interchange
and communication between these systems they are all doomed to fail.

In this jungle of different systems, publishers of hypertexts must worry about portability of their works
between different hypertext systems to ensure that they don't depend too much upon the success of one
system. The users of hypertext systems must worry about the supply of hypertexts or use of hypertext
organization of long-lived project documentation stored in a specific hypertext system, making the data
inaccessible for other (hypertext) systems.

Steps toward interchange and communication between open hypertext systems must be based on
formal and abstract models of hypertext to which all existing and hopefully future systems can be
related. In the last few years an increasing number of papers on hypertext and its application has
been published. Only a very small part of this work has been concerned with the formal treatment
of hypertext. There is clearly a need for a more formal approach to hypertext since one can claim
that hypertext is driven by user interface and implementation considerations [Halasz & Conklin 1989].
Looking through the Hypertext'89 Proceedings [Hypertext 1989] one will find disappointingly few
papers on the more formal and abstract aspects of hypertext. However, attempts to present more
formal models of hypertext have appeared [Delisle & Schwartz 1987] [Garg 1988] [Stotts & Furuta 1989]
[Consens & Mendelzon 1989]. This paper presents a formal model of hypertext, using the Vienna
Development Method (VDM) [Bjørner & Jones 1982] [Jones 1986]. VDM supports the top-down development
of software systems specified in a notation suitable for formal verification. The specifications are based on
a datamodel using high-level types such as set, list, map and cartesian product. Function specifications are
written in predicate logic, using pre-conditions stating the properties that the inputs must satisfy, and
post-conditions which state the relationship of inputs to outputs.

Figure 1: A Snapshot of the Prototype

* A version of this paper emphasizing a formal specification methodology and with different technical details, but
inevitably overlapping in the datamodel facet with the present paper, is being presented at VDM'90 and published in the
conference proceedings by kind permission of the Programme Committee and the editors.

† Author's Present Address

¹ A part of the work has taken place at the Technical University of Denmark

At Brüel & Kjær² we have developed a prototype of a hypertext system. The prototype was developed
on a SUN3 workstation³ using an expert system shell called ART⁴. The prototype was written partly
in ART's rule-based language and Common Lisp [Steele 1984] using a window based user interface, see
figure 1. The prototype has fulfilled several aims. First, it has given the developers a feeling of what
hypertext is all about, by working with the prototype. Secondly, the ideas of hypertext have easily been
communicated to non-experts and potential users.

Our experiences with this prototype and studies of hypertext systems such as HyperCard, Hyperties,
Neptune, KMS, NoteCards, etc. are captured in the formal specification presented in section 2 and section
3 of this paper. In section 2 the datamodel of hypertext is presented by domain equations giving a
formal definition of the primitives of hypertext, introducing the three main concepts: nodes, links and
structures. In section 3 the datamodel is extended with a set of operations in an object-oriented way,
defining abstract datatypes of nodes, links and structures. Our experiences with this formal model and
future work are discussed and concluded in section 4. Detailed pre-/post-specifications of the specified
operations can be found in appendix A.

2 Developing a Basic Datamodel of Hypertext 

The hypertext datamodel has evolved on the basis of the experience with our prototype and our general
knowledge of the domain. The model will include the concept of nodes and their interior, links between
nodes and between fields and buttons inside the nodes. Different kinds of links are described: N-ary
links, second order links and active links. Additionally the idea of having structures organizing nodes in
e.g. hierarchies, is introduced.

In the following a datamodel of hypertext is developed through stepwise refinement. Initially the 
meaning of hypertext is defined as a database that has active cross-references, allowing the user to have 
nonsequential access to a text thereby making the reading process nonlinear. A hypertext can be modelled 
as a set of nodes and a collection of links where the nodes are documents and the links are cross-references.

² Brüel & Kjær Industri is a company that designs and manufactures high-precision electronic measuring instruments.

³ Sun Workstation is a registered trademark of Sun Microsystems, Inc.

⁴ ART (Automated Reasoning Tool) is a registered trademark of Inference Corporation.

Figure 2: Example of linked nodes



1.0 Hypertext :: Nodes × Links

2.0 Nodes = "chunks of information"

3.0 Links = "cross-references"

2.1 Nodes - Units of Information 

An information fragment in a hypertext is called a node. Thus, hypertext is made up of a collection of 
distinct named information fragments. Conceptually this information fragment usually describes a single 
concept or topic. The names may be assigned explicitly by the user or they can be assigned automatically. 

In some hypertexts it might be necessary to divide the nodes into several different types: document, 
illustration, annotation, etc. Thus, it must be possible to add attributes and attribute values to nodes. 



4.0 Nodes = Nid -m-> (Node × Attributes)
5.0 Node :: "information"
6.0 Nid :: TOKEN
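
To make the map in equation 4.0 concrete, the sketch below mirrors it with a Python dictionary. It is an illustration only and not part of the VDM specification: the helper names `add_node` and `nodes_of_type` and the use of a `type` attribute are assumptions.

```python
from typing import Any, NewType

Nid = NewType("Nid", str)                   # 6.0: node identifier, an opaque token
Attributes = dict[str, Any]                 # attribute name -> attribute value
Nodes = dict[Nid, tuple[str, Attributes]]   # 4.0: Nid -m-> (Node x Attributes)


def add_node(nodes: Nodes, nid: Nid, info: str, **attrs: Any) -> None:
    """Insert a node; being a map, Nodes holds each identifier at most once."""
    if nid in nodes:
        raise KeyError(f"node {nid!r} already exists")
    nodes[nid] = (info, dict(attrs))


def nodes_of_type(nodes: Nodes, node_type: str) -> list[Nid]:
    """Select node identifiers by a user-defined 'type' attribute."""
    return sorted(n for n, (_, a) in nodes.items() if a.get("type") == node_type)
```

The node types mentioned above (document, illustration, annotation) then become ordinary attribute values rather than separate kinds of node.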



2.2 Links - the Glue that Holds Hypertext Together 

A connection between two nodes is called a link. When a link is activated, say by a mouse click, one
can jump to the node the link points to. A hypertext network is made up of a collection of uniquely
named links. Links can be used to transfer the reader to a new topic, provide access to an annotation
or footnote, show a reference and so on. Conceptually a link is directed, i.e. it points from one node to
another, having an origin called the anchor and an end point called the destination. However, this does
not mean that links are unidirectional; that is, the passage is not only one-way. One can always pose the
question: who points to me?

In figure 2 one can see an example of a document consisting of a section, two subsections and a
reference list. The section is connected to its subsections through node to node links. All three items
link to a common reference list. The section node might contain the text of the introduction to the two
subsections, and the nodes of the subsections contain the text of the subsections. Below the concept
of linking is restricted to only concern connections between entire nodes. In section 2.4 the model is
extended to include links between the contents of one node and another node.

A hypertext system may have only one type of link or it may have several types. The link type can 
reflect the type of information it is pointing to, making it possible for the user only to view links of a 
certain type. Different types of links in a document could be references to related articles or reviewers'
annotations. To represent this variety of link types, links can be attributed in the same manner as
nodes.
Figure 3: Example of the use of schema



7.0 Links = Lid -m-> Link
8.0 Link :: Connections × Attributes
9.0 Connections :: Anchor × Destination
10.0 Anchor, Destination = Nid
11.0 Lid :: TOKEN
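
Equations 7.0 to 11.0 can be mirrored directly in code, which also shows why a directed link need not be one-way: the question "who points to me?" is just a search over the link map. The sketch is illustrative; `follow` and `who_points_to` are hypothetical helper names, and node identifiers are plain strings.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Connections:          # 9.0: Anchor x Destination
    anchor: str             # 10.0: both end points are node identifiers (Nid)
    destination: str


@dataclass
class Link:                 # 8.0: Connections x Attributes
    connections: Connections
    attributes: dict[str, Any] = field(default_factory=dict)


Links = dict[str, Link]     # 7.0: Lid -m-> Link


def follow(links: Links, lid: str) -> str:
    """Activate a link: return the node it points to."""
    return links[lid].connections.destination


def who_points_to(links: Links, nid: str) -> list[str]:
    """Traverse links backwards: which nodes have a link ending at `nid`?"""
    return sorted(
        link.connections.anchor
        for link in links.values()
        if link.connections.destination == nid
    )
```

With the document of figure 2 (a section, two subsections and a reference list), `who_points_to` applied to the reference list returns all three text nodes.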



An important point in hypertext is the support for collaborative work. If several people are reviewing 
and annotating the same hypertext, they all use the common network made by the author of the document. 
To this common network each individual can add a personal subnetwork reflecting their own need for 
referencing across the common network and including references for their annotations. Looking at other 
persons' subnetworks, one can inspect their annotations, possibly realizing that further comments on
specific topics are needless, thus saving time in a review process. This does not remove the need for 
attributed links. One may still need to add individual information to the link, like the time when it was 
created, why it was created, etc. 



12.0 Networks = Nwid -m-> (Links × Attributes)
13.0 Nwid :: TOKEN
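
A short sketch may help to show how named networks support the collaborative scenario above: a reader overlays the author's common network with any personal subnetworks of interest. This is an assumption-laden illustration, not the paper's specification; links are simplified to (anchor, destination) pairs and `visible_links` is a hypothetical helper.

```python
Link = tuple[str, str]                     # (anchor Nid, destination Nid), simplified
Links = dict[str, Link]                    # Lid -m-> Link
Networks = dict[str, tuple[Links, dict]]   # 12.0: Nwid -m-> (Links x Attributes)


def visible_links(networks: Networks, *nwids: str) -> dict[tuple[str, str], Link]:
    """Overlay the chosen networks.

    Keys are (Nwid, Lid) pairs, so two networks may use the same link
    name without clashing.
    """
    view: dict[tuple[str, str], Link] = {}
    for nwid in nwids:
        links, _attributes = networks[nwid]
        for lid, link in links.items():
            view[(nwid, lid)] = link
    return view
```

Identifying a link by network name and link name is the same addressing scheme the model later uses when links themselves become connection points.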



2.3 Slots - the Interior of the Node

Conceptually the node can cover a wide range of applications, i.e. representing a chapter or section in a 
document, function definitions in the source text of programs, organizing information on notecards, etc. 
Obviously there is a need for a substructure in the interior of the node. 

A slot is a kind of template for the contents of the node. It can be compared to the record datatype in
programming languages. A node has a collection of uniquely named slots, each having some kind of textual
content. An example of the use of schema in a node is shown in figure 3. In this example information on
individuals is organized in an archive. For each person there exists one basic "card" carrying a specific set of
information: name, address, phone and family. "Cards" can be annotated and one can make references
between the "cards". In the family slot, one can mention the spouse and make a link to his/her "card".
In our model these "cards" are equal to the node.

Slots can be connection points for links. As anchors and destinations they are identified by the node 
in which they are embedded and their name. 

14.0 Node = Slid -m-> Slot

15.0 Slot :: String × Attributes

16.0 String :: CHAR*

17.0 Anchor, Destination = ... | (Nid × Slid)

18.0 Slid :: TOKEN
Figure 4: Example of buttons

2.4 Buttons and Fields - the Referential Mechanism

In this section buttons and fields are introduced. They are the fundamental components of the referential
mechanism, one of the most powerful properties of hypertext. Links connecting entire nodes and slots 
have already been introduced. Now the concept of linking is extended to cover source and destination 
points inside the nodes. Pragmatically this covers the referential use of links in a hypertext. 

A handle is a part of the text inside the slot to which a link can be attached. This makes it possible 
to establish connections between the contents of one node and another node. A handle is defined as a 
consecutive sequence of characters in the textual contents of the slot. More precisely, it is given by its
character position in the text and its span in number of characters.

When a link is anchored to a handle, that is, there is an outgoing link from a handle, the text span
specified by the handle is called a button. In figure 4 it is shown that one can get from an actual reference 
in the text to the reference list. 

Fields are defined exactly in the same way as the buttons are. We have chosen to distinguish between 
these two for purely conceptual reasons, thus having fields as one of the possible end-points of links.

The domain of connections is extended to include buttons and fields. From a connections point of 
view, a button or field is identified by the node and slot in which it is embedded and its handle in that 
slot. 



19.0 Slot :: String × Handles × Attributes
20.0 Handles = Hid -m-> Region
21.0 Region :: Position × Length
22.0 Position, Length :: N0
23.0 Anchor = ... | Button
24.0 Destination = ... | Field
25.0 Button, Field = (Nid × Slid × Hid)



To continue the example, the use of fields makes it possible to follow a reference not only to the
reference list but to a certain entry in this reference list, see figure 5. Depending on the user-interface
the entry, i.e. the field, is accentuated.
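
The handle mechanism amounts to addressing a span of characters inside a slot. The sketch below follows equations 19.0 to 21.0 but is only an illustration: positions are assumed to be zero-based offsets (the paper does not fix a convention), and `add_handle` and `span` are hypothetical method names.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Region:          # 21.0: Position x Length
    position: int      # assumed: zero-based offset into the slot text
    length: int        # span in number of characters


@dataclass
class Slot:            # 19.0: String x Handles x Attributes
    text: str
    handles: dict[str, Region] = field(default_factory=dict)   # 20.0: Hid -m-> Region
    attributes: dict[str, object] = field(default_factory=dict)

    def add_handle(self, hid: str, position: int, length: int) -> None:
        """Register a handle, rejecting spans that fall outside the text."""
        if position < 0 or length < 0 or position + length > len(self.text):
            raise ValueError("handle must lie inside the slot text")
        self.handles[hid] = Region(position, length)

    def span(self, hid: str) -> str:
        """The text a button or field built on this handle would highlight."""
        region = self.handles[hid]
        return self.text[region.position:region.position + region.length]
```

A button or field is then fully identified by the triple (node name, slot name, handle name) of equation 25.0, and a user interface can accentuate exactly the characters returned by `span`.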



2.5 More on Links - N-ary Links, 2nd Order Links and Active Links

So far only binary links have been treated. Binary links are characterized by one link anchor and one
destination point. They match the concept of navigating in a hypertext very well. That is, if one has an
end-point of a link, there is only one way to go, if one chooses to follow the link.

For structural reasons it may be more appropriate to consider a more general concept of links. N-ary
links have one or more link anchors and one or more destination points. In the model this means that
a set of link anchors and destination points are bound to the same link. An example of N-ary links is
shown in figure 6. In this example three sections in a document refer to a certain article. Following
the links, one might first be directed to an entry in an annotated reference list, for reading an abstract,
and then to the article itself. In this way the concept of N-ary links forms the basis of following links
in several steps, that is being directed to a short description of the destination before actually arriving
there.



26.0 Connections :: Anchor-set × Destination-set

Nodes, slots and fields have been discussed as destination points for links. Links pointing at links,
called 2nd order links, can be used to point at a collection of connections. It might reflect that a link
itself is of special interest, and that the reader, after being guided to the link, can choose to study the
anchor or destination of the link. Links are identified as connection points by the name of the network in
which they are embedded, and their own name.

27.0 Anchor, Destination = ... | (Nwid × Lid)

Active links are links that have anchors or destinations that are function denotations. That is, instead 
of having links pointing at fragments of text they contain a function. This function is to be interpreted 
when one is following the link. This kind of a link can be used to generate a view of the data it is anchored 
to. That could be the generation of a graphical representation of the data each time one is following 
the link. A function signature is added to the domain of anchors and destinations. The domains of the
arguments and the results of the function are not specified in any further detail. 

28.0 Anchor, Destination = ... | Argument-set → Result-set
29.0 Argument, Result = ... 
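
The two generalizations can be combined in one small sketch: connections carry sets of end points (equation 26.0), and an end point may be a function that is interpreted when the link is followed (equation 28.0). The code is illustrative only. In particular, active end points are simplified here to functions of no arguments, whereas the model leaves the argument and result domains open; `follow` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Callable, Union

Nid = str
Active = Callable[[], str]       # simplification: no arguments, textual result
EndPoint = Union[Nid, Active]


@dataclass(frozen=True)
class Connections:               # 26.0: Anchor-set x Destination-set
    anchors: frozenset
    destinations: frozenset


def follow(conn: Connections, nodes: dict[Nid, str]) -> list[str]:
    """Resolve every destination of an N-ary link.

    Node names are looked up in `nodes`; active end points are called,
    so their result is generated afresh each time the link is followed.
    """
    results = []
    for dest in conn.destinations:
        results.append(dest() if callable(dest) else nodes[dest])
    return sorted(results)
```

For the example of figure 6, three section nodes would appear in `anchors`, and the annotated reference entry and the article itself in `destinations`.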

2.6 Structures - the Organizers of Hypertext 

The hypertext in figure 2 represents the most simple organisation of a hypertext. This example of a
hypertext is a set of nodes connected by links. A hierarchy of nodes in a hypertext is another primitive
example of organising a hypertext. It is a way of organizing information into meaningful parts, e.g.
documents into sections and subsections. Figure 7 shows such a hierarchy of sections and subsections in
a document. The user is usually free to define information structures in traditional hypertext systems
as they are needed. But the novice user sometimes may require guidance by the hypertext itself, or one
may find ad hoc organisation of hypertexts potentially dangerous. The problem can be solved by using
structures.

Structures should prescribe an organization of nodes and networks. They can conceptually be com- 
pared to the domain equations in VDM, introducing sets, sequences, maps and the possibility of recursive 
definitions, e.g. tree data structures. The structures can form a basis for an algebra for structured
hypertext documents [Güting et al.].

The use of the set-structure has already been demonstrated and fits well into card-like hypertexts.
The map-structure can extend this unordered collection of cards with a facility of direct access by user
defined names. Sequences can be used to express interrelationships between nodes such as the sequence in
which they should be visited, e.g. chapters in a book. Defining these structures recursively makes it
possible to make tree structures of nodes.

Figure 5: Example of Fields

It should be emphasized that it is not the nodes and networks themselves that are organized in these
structures. The structures contain only the names of the nodes and networks. Hence it is possible to reuse
nodes and networks in several structures. E.g. one can think of a section or figure appearing in more
than one book, and thus in several structures.

Structures can be interpreted by filters to make linear representations of the hypertext, e.g. on 
paper. A tree structure of a book should intuitively be interpreted by a filter in a top-down, 
left-to-right manner, so that chapter one and the subsections of this chapter are written out before 
chapter two, and so on. 
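The filter idea can be illustrated with a small sketch. It is not part of the model: the function name `linearize` and the dictionary representation are assumptions made here for illustration, and the structure is assumed to be an acyclic tree of sequences whose entries are either node names or names of nested substructures.

```python
from typing import Dict, List

# Hypothetical representation: each substructure is a sequence of
# destinations, and a destination is either a node name or the name
# of another substructure (which makes the definition recursive).
Substructures = Dict[str, List[str]]


def linearize(substructures: Substructures, root: str) -> List[str]:
    """Return node names in top-down, left-to-right order."""
    order: List[str] = []
    for destination in substructures[root]:
        if destination in substructures:
            # A nested substructure: write out its contents before moving on.
            order.extend(linearize(substructures, destination))
        else:
            order.append(destination)
    return order


book = {
    "book": ["preface", "chapter1", "chapter2"],
    "chapter1": ["ch1-intro", "ch1-sec1", "ch1-sec2"],
    "chapter2": ["ch2-intro"],
}
print(linearize(book, "book"))
```

Because the structure holds only names, the same node name could appear under several roots without the node itself being copied.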

Structures are uniquely identified by their name. Each structure is characterized by having a 
collection of substructures, each organizing destinations into sets, sequences or maps. The 
substructures themselves have unique identities and can be destinations, thus making it possible to 
build more complicated structures. A structure has a root that identifies one of the substructures as 
being the root of the structure. 



30.0  Structures           =  Sid -m-> (Structure × Attributes) 
31.0  Structure            =  Subsid -m-> Substructure 
32.0  Substructure         :: Substruc × Attributes 
33.0  Substruc             =  Set | Seq | Map 
34.0  Set                  :: Destination-set 
35.0  Seq                  :: Destination* 
36.0  Map                  :: TOKEN -m-> Destination 
37.0  Anchor, Destination  =  ... | Sid | (Sid × Subsid) 
38.0  Sid, Subsid          :: TOKEN 



2.7 The Attributes 

Attributes are basically a mapping between names of attributes and their values. The names of the 
attributes are user defined. The values of the attributes can be of a simple text or numerical type, 
but one can also expect structured types as known from the attributes of attribute grammars. Among 
the attributes that should be mentioned are version numbers, creation time, access rights, protection, etc. 



39.0  Attributes  =  Attribute -m-> Value 
40.0  Attribute   :: TOKEN 
41.0  Value 



2.8 The Hypertexts - Bringing It All Together 

Basically the developed datamodel says that a hypertext is a collection of nodes, one or more networks 
connecting the nodes, and a structure describing the organization of the parts that form the hypertext. 

Figure 6: Example of N-ary links 

Figure 7: Example of a hierarchy 

The networks represent the referential links, that is, the explicit links connecting two or more parts of 
the hypertext. The structures organize the nodes and the networks. One can say that there is 
a dualism between networks and structures, in that structures represent a kind of organizational link 
between nodes in a hypertext. 

In this way one can represent several hypertext applications in one collection of nodes, simply by letting 
the actual hypertext application apply a certain network and a certain structure to the nodes. The 
actual buttons in a node are resolved by the hypertext application only when one or more networks are 
applied to it, and the node will show different sets of buttons depending on the applied networks. Finally, 
a hypertext is defined as: 

42.0 Hypertext :: Nodes x Networks x Structures 

This observation leads to the object-oriented approach to a model, defining the hyperbase in terms 
of abstract datatypes, as presented in the following section. 

3 An Object-Oriented Model 

Having seen the basic datamodel of hypertext, it seems a good idea to follow an object-oriented 
approach in the specification of the semantic functions. Nodes, networks, and structures should 
be defined as abstract datatypes. The domains of each of these datatypes have already been described in 
the previous section. 

In the following, a simple model of an object-oriented database is presented. Based upon this model, 
the operations of the abstract datatypes introduced by the datamodel in the previous section are 
formally specified. 

3.1 An Informal Model of an Object-Oriented Database 

The class of an object is the abstract data type of the object. Thus an object may be thought of as 
an instance of a particular class. A class defines the set of operations applicable to all instances of 
that class in terms of names of operations and types of formal arguments and results. An implementation 
of a class provides a set of operation procedures implementing the set of operations defined by the class. 
The implementation encapsulates the data representation and the algorithms that are used to perform 
the operations. The data representation of an object is a collection of data that makes up the state of 
the object. The state is managed by the implementation and is only accessible by means of the operation 
procedures [Crawley 1986]. 

Below, the basic domain of an object-oriented database is modelled as a collection of instantiated 
objects, each having a unique identity. An instantiated object has a state that can be changed through 
the set of class operations. The domain of the state and the set of class operations are defined by the 
type definition of the class. 

Figure 8: The Class Hierarchy 



43.0  Objectbase  =  Objid -m-> Object 
44.0  Object      :: State × Opes 
45.0  State 
46.0  Opes        =  Opeid -m-> Ope 
47.0  Ope         =  Args → State → (State × Res) 
48.0  Args, Res   =  ... 
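The domains above can be read as a dictionary of objects whose state is changed only through named operations. The following Python is a minimal sketch under that reading; the `increment` operation, the `operate` helper and the integer state are illustrative assumptions, not part of the specification.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple

State = Any
# An operation maps (arguments, state) to (new state, result),
# mirroring  Ope = Args -> State -> (State x Res).
Ope = Callable[[Tuple[Any, ...], State], Tuple[State, Any]]


@dataclass
class Object:
    state: State
    opes: Dict[str, Ope]


def increment(args: Tuple[Any, ...], state: State) -> Tuple[State, Any]:
    """Hypothetical operation: add args[0] to an integer state."""
    new_state = state + args[0]
    return new_state, new_state


def operate(objectbase: Dict[str, Object], objid: str, opeid: str,
            args: Tuple[Any, ...]) -> Any:
    """Run one class operation; the state is touched only through it."""
    obj = objectbase[objid]
    obj.state, result = obj.opes[opeid](args, obj.state)
    return result


objectbase = {"counter-1": Object(state=0, opes={"increment": increment})}
print(operate(objectbase, "counter-1", "increment", (5,)))
```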

3.2 An Object-Oriented Hyperbase 

Now the domain of hyperbases is applied to the concepts of object-oriented databases. The hyperbase 
covers basic operations on instances, such as the creation of new instances, basic object version 
management and object access control. 

An object-oriented hyperbase is in this way defined as a collection of uniquely named instances of 
three object types. Each instance has a state whose type depends on the type of the object. The three 
applicable state types are node, network and structure, as defined in the datamodel. A set of operations 
is defined for each type. Furthermore, each instance has a set of predecessors and a set of successors, 
identifying the neighbours of the instance in the version chain. 



49.0  HyperBase   =  Objid -m-> Object 
50.0  Object      :: State × Operations × Attributes × Succ-set × Pred-set 
51.0  State       =  Node | Links | Structure 
52.0  Objid       =  Nid | Nwid | Sid 
53.0  Operations  =  Opeid -m-> Operation 
54.0  Operation   =  Argument-set → State → (State × Result-set) 
55.0  Opeid       :: TOKEN 

3.2.1 Fundamental Operations 

The CreateInstanceOf operation can make instances of the subclasses, that is, it can make node, network 
and structural objects, returning the unique names of these objects. These instances can be destroyed by 
the DestroyInstance operation. The identities of all instances of a given class can be collected 
by the SetOfInstances operation. 

56.0  ObjectClass = NODES | NETWORKS | STRUCTURES 
57.0  type: CreateInstanceOf : ObjectClass → Hyperbase → (Objid × Hyperbase) 
 .1   type: DestroyInstance : Objid → Hyperbase → Hyperbase 
 .2   type: SetOfInstances : ObjectClass → Hyperbase → Objid-set 
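As a rough illustration of these three signatures, the sketch below treats the hyperbase as an immutable-style dictionary that each operation copies and returns. The encoding of an identity as a `(class, serial)` pair and the dictionary layout of an instance are assumptions made for this example; the relinking of version chains on destruction is deliberately left out.

```python
import itertools
from typing import Dict, Set, Tuple

NODES, NETWORKS, STRUCTURES = "nodes", "networks", "structures"
Objid = Tuple[str, int]        # (class tag, serial number): an assumed encoding
Hyperbase = Dict[Objid, dict]

_serial = itertools.count(1)   # source of fresh identities


def create_instance_of(object_class: str, hyperbase: Hyperbase) -> Tuple[Objid, Hyperbase]:
    """Return a fresh identity and a hyperbase extended with an empty instance."""
    if object_class not in (NODES, NETWORKS, STRUCTURES):
        raise ValueError(f"unknown object class: {object_class}")
    objid = (object_class, next(_serial))
    new_base = dict(hyperbase)
    new_base[objid] = {"state": {}, "attrs": {}, "succ": set(), "pred": set()}
    return objid, new_base


def destroy_instance(objid: Objid, hyperbase: Hyperbase) -> Hyperbase:
    """Return a hyperbase without the given instance (version links not repaired)."""
    if objid not in hyperbase:
        raise KeyError(objid)
    return {k: v for k, v in hyperbase.items() if k != objid}


def set_of_instances(object_class: str, hyperbase: Hyperbase) -> Set[Objid]:
    """Collect the identities of all instances of one class."""
    return {objid for objid in hyperbase if objid[0] == object_class}
```

Returning a new dictionary rather than mutating the argument keeps the sketch close to the functional style of the signatures, where the hyperbase appears on both sides of the arrow.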

3.2.2 Basic Object Version Management 

This set of functions refers to the version management of the hyperbase. The CreateSuccessorOfInstance 
operation creates a copy of a specified object instance. The identity of the created object instance is 
added to the successor set of the specified instance, whose identity is in turn added to the predecessor 
set of the new object instance. The predecessor and successor sets of an instance are found by the 
PredecessorOfInstance and SuccessorOfInstance operations, respectively. The MergeInstances operation 
merges two objects into one object. 

58.0  type: CreateSuccessorOfInstance : Objid → Hyperbase → (HyperBase × Objid) 
 .1   type: PredecessorOfInstance : Objid → Hyperbase → Objid-set 
 .2   type: SuccessorOfInstance : Objid → Hyperbase → Objid-set 
 .3   type: MergeInstances : Objid × Objid → Hyperbase → (HyperBase × Objid) 
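The bookkeeping of predecessor and successor sets might look as follows. This is only a sketch: the `Instance` class, the `"<objid>.<n>"` naming of new versions and the in-place update of the hyperbase are assumptions chosen for brevity, and merging is not shown because the text does not specify it further.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class Instance:
    state: Any
    succ: Set[str] = field(default_factory=set)
    pred: Set[str] = field(default_factory=set)


def create_successor_of_instance(objid: str, hyperbase: Dict[str, Instance]) -> str:
    """Copy an instance and link the copy into the version chain."""
    original = hyperbase[objid]
    # Assumed naming scheme for fresh identities: "<objid>.<n>".
    n = 1
    while f"{objid}.{n}" in hyperbase:
        n += 1
    new_id = f"{objid}.{n}"
    hyperbase[new_id] = Instance(state=copy.deepcopy(original.state), pred={objid})
    original.succ.add(new_id)
    return new_id


def predecessor_of_instance(objid: str, hyperbase: Dict[str, Instance]) -> Set[str]:
    return set(hyperbase[objid].pred)


def successor_of_instance(objid: str, hyperbase: Dict[str, Instance]) -> Set[str]:
    return set(hyperbase[objid].succ)
```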

3.2.3 Object Access Control 

The Open operation is concerned with checking the access conditions of the instance before allowing 
access to the set of operations. The Close operation resets the access conditions after they have been 
altered by a previous Open. One has access to the operations of the hyperbase objects through the 
OperateOnInstance function. The identity of the object instance and the name of the operation to be 
executed are passed to this function. 

59.0  type: Open : ... 
 .1   type: Close : ... 
 .2   type: OperateOnInstance : Objid × Opeid × Argument-set → Hyperbase → (Hyperbase × Result-set) 

3.2.4 Object Attribute Operations 

AddAttribute adds a named attribute to the set of attributes of the object. Attributes are removed by the 
RemoveAttribute operation. Values are assigned to attributes by the AssignAttribute operation. Finally, 
the value of an attribute is read by using ReadAttribute. 

60.0  type: AddAttribute : (Objid × Name) → Node → Node 
 .1   type: RemoveAttribute : (Objid × Name) → Node → Node 
 .2   type: AssignAttribute : (Objid × Name × Value) → Node → Node 
 .3   type: ReadAttribute : (Objid × Name) → Node → Value 

3.3 The Three Object Classes of a Hyperbase 

The three object classes or abstract datatypes of a hyperbase represent the nodes, the networks and the 
structures. 

3.3.1 A Node Class 

The objects of the node class have zero or more slots. The operations are divided into three groups. 
The first set of operations is grouped around the schema of the node, and the second set is grouped 
around the end-points of links: handles and regions. The final group of operations is the node attribute 
operations. 

Slot Operations. The AddSlot operation adds a new and empty slot to the node instance. The identity 
of the new slot is returned to the user. The RemoveSlot operation removes a slot and its contents 
from the node. One can use the ReturnSlots operation to get the set of names of the slots allocated in the 
schema of a node instance. 

61.0  type: AddSlot : () → Node → (Node × Slid) 
 .1   type: RemoveSlot : Slid → Node → Node 
 .2   type: ReturnSlots : () → Node → Slid-set 

Slot Browsing Operations. The contents of a specified slot can be delivered as a string of characters 
by using SlotView. SlotInsert is an example of an editing operation. One can use this operation to 
insert a string at a position in the contents of a specified slot. SlotDelete can be used to remove 
a specified portion of the text of a slot. 

62.0  type: SlotView : Slid → Node → STRING 
 .1   type: SlotInsert : (Slid × STRING × Position) → Node → Node 
 .2   type: SlotDelete : (Slid × Position × Length) → Node → (Node × Hid-set) 
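The text-editing part of these operations can be sketched with plain strings. In the sketch below a node is reduced to a mapping from slot identities to slot contents; handles and attributes are omitted, positions are counted from 0, and the set of handle identities that SlotDelete returns in the signature is not modelled. All of these simplifications are assumptions of the example.

```python
from typing import Dict

Node = Dict[str, str]  # slot identity -> slot contents (handles, attributes omitted)


def slot_view(node: Node, slid: str) -> str:
    """Deliver the contents of a slot as a string."""
    return node[slid]


def slot_insert(node: Node, slid: str, s: str, position: int) -> Node:
    """Return a node in which `s` is inserted at `position` in the slot."""
    text = node[slid]
    if not 0 <= position <= len(text):
        raise ValueError("position outside slot contents")
    return {**node, slid: text[:position] + s + text[position:]}


def slot_delete(node: Node, slid: str, position: int, length: int) -> Node:
    """Return a node in which `length` characters from `position` are removed."""
    text = node[slid]
    if position < 0 or length < 0 or position + length > len(text):
        raise ValueError("region outside slot contents")
    return {**node, slid: text[:position] + text[position + length:]}
```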

Handle Operations. A handle can be added to a specified region of the contents of a slot by the 
AddHandle operation. The handle is given a unique identity, which is returned to the user. One can add 
several handles to the same region, and regions can overlap. A handle is removed by using 
RemoveHandle. The names of the handles located in a slot are returned by the ReturnSlotHandles 
operation, and the names of the handles at a specified position in a slot are returned by the 
ReturnPositionHandles operation. The region specified by a handle is returned by the GetHandle operation. 

63.0  type: AddHandle : (Slid × Region) → Node → (Node × Hid) 
 .1   type: RemoveHandle : (Slid × Hid) → Node → Node 
 .2   type: ReturnSlotHandles : Slid → Node → Hid-set 
 .3   type: ReturnPositionHandles : (Slid × Position) → Node → Hid-set 
 .4   type: GetHandle : (Slid × Hid) → Node → Region 
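A possible reading of the handle operations for a single slot is sketched below. The `Slot` class, the integer handle identities, the 0-based positions and the half-open interpretation of a region (a position belongs to a region when `p <= position < p + l`) are assumptions of this example rather than statements of the model.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Region = Tuple[int, int]  # (position, length)


@dataclass
class Slot:
    text: str = ""
    handles: Dict[int, Region] = field(default_factory=dict)
    next_hid: int = 1  # counter for fresh handle identities (assumed scheme)


def add_handle(slot: Slot, region: Region) -> int:
    """Attach a handle to a region of the slot and return its identity."""
    pos, length = region
    if pos < 0 or length < 0 or pos + length > len(slot.text):
        raise ValueError("region lies outside the slot contents")
    hid = slot.next_hid
    slot.next_hid += 1
    slot.handles[hid] = region
    return hid


def remove_handle(slot: Slot, hid: int) -> None:
    del slot.handles[hid]


def return_slot_handles(slot: Slot) -> Set[int]:
    return set(slot.handles)


def return_position_handles(slot: Slot, position: int) -> Set[int]:
    """Handles whose region covers the given position."""
    return {hid for hid, (p, l) in slot.handles.items() if p <= position < p + l}


def get_handle(slot: Slot, hid: int) -> Region:
    return slot.handles[hid]
```

Several handles may share or overlap a region, which is why the position query returns a set rather than a single identity.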

The Slot Attribute Operations. AddAttribute adds a named attribute to the set of attributes of 
the slot. Attributes are removed by the RemoveAttribute operation. Values are assigned to attributes by 
the AssignAttribute operation. Finally, the value of an attribute is read by using ReadAttribute. 

64.0  type: AddAttribute : (Slid × Name) → Node → Node 
 .1   type: RemoveAttribute : (Slid × Name) → Node → Node 
 .2   type: AssignAttribute : (Slid × Name × Value) → Node → Node 
 .3   type: ReadAttribute : (Slid × Name) → Node → Value 

3.3.2 A Network Class 

The operations of the network class consist of six network changing operations and three querying 
operations. 

Network Changing Operations. The AddLink operation adds a new and empty link to the network. 
The operation gives the link a unique identity, which is returned to the user. A link is removed by the 
RemoveLink operation. The anchors and destinations of the link in question do not have to be empty. 
Anchors and destinations are added to a specified link by the two operations AddAnchor and 
AddDestination. Removing anchors or destinations is done by the RemoveAnchor and RemoveDestination 
operations. 



65.0  type: AddLink : () → Links → (Links × Lid) 
 .1   type: RemoveLink : Lid → Links → Links 
 .2   type: AddAnchor : (Lid × Anchor) → Links → Links 
 .3   type: RemoveAnchor : (Lid × Anchor) → Links → Links 
 .4   type: AddDestination : (Lid × Destination) → Links → Links 
 .5   type: RemoveDestination : (Lid × Destination) → Links → Links 

Network Querying Operations. The two querying operations HavingAnchor and HavingDestination 
are used to identify the links of a certain network instance that have the specified anchor or destination 
in common. The ReadLink operation reads the anchor and destination sets of the specified link. 

66.0  type: HavingAnchor : Anchor → Links → Lid-set 
 .1   type: HavingDestination : Destination → Links → Lid-set 
 .2   type: ReadLink : Lid → Links → (Anchor-set × Destination-set) 
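The network class can be pictured as a table of links, each holding a set of anchors and a set of destinations. The sketch below is one hypothetical rendering: the `Network` and `Link` class names, the integer link identities and the mutable style are assumptions of the example, and link attributes are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, Hashable, Set, Tuple


@dataclass
class Link:
    anchors: Set[Hashable] = field(default_factory=set)
    destinations: Set[Hashable] = field(default_factory=set)


class Network:
    def __init__(self) -> None:
        self.links: Dict[int, Link] = {}
        self._next_lid = 1  # assumed scheme for fresh link identities

    def add_link(self) -> int:
        """Add a new, empty link and return its identity."""
        lid = self._next_lid
        self._next_lid += 1
        self.links[lid] = Link()
        return lid

    def remove_link(self, lid: int) -> None:
        del self.links[lid]  # the link need not be empty

    def add_anchor(self, lid: int, anchor: Hashable) -> None:
        self.links[lid].anchors.add(anchor)

    def remove_anchor(self, lid: int, anchor: Hashable) -> None:
        self.links[lid].anchors.remove(anchor)

    def add_destination(self, lid: int, destination: Hashable) -> None:
        self.links[lid].destinations.add(destination)

    def remove_destination(self, lid: int, destination: Hashable) -> None:
        self.links[lid].destinations.remove(destination)

    def having_anchor(self, anchor: Hashable) -> Set[int]:
        return {lid for lid, link in self.links.items() if anchor in link.anchors}

    def having_destination(self, destination: Hashable) -> Set[int]:
        return {lid for lid, link in self.links.items()
                if destination in link.destinations}

    def read_link(self, lid: int) -> Tuple[Set[Hashable], Set[Hashable]]:
        link = self.links[lid]
        return set(link.anchors), set(link.destinations)
```

Because both ends of a link are sets, a single link can connect several anchors to several destinations, which corresponds to the N-ary links of figure 6.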

The Link Attribute Operations. AddAttribute adds a named attribute to the set of attributes 
of the specified link. Attributes are removed by the RemoveAttribute operation. Values are assigned to 
attributes by the AssignAttribute operation. Finally, the value of an attribute is read by using ReadAttribute. 

67.0  type: AddAttribute : (Lid × Name) → Links → Links 
 .1   type: RemoveAttribute : (Lid × Name) → Links → Links 
 .2   type: AssignAttribute : (Lid × Name × Value) → Links → Links 
 .3   type: ReadAttribute : (Lid × Name) → Links → Value 

3.3.3 A Structural Class 

The operations of a structure are divided into four groups. The first is concerned with the more general 
operations on the structure, i.e. adding and removing substructures etc. The final three groups are 
concerned with the specific operations of the three types of substructures: sets, sequences and maps. 

Structure Operations. A substructure can be added to the structure by using the AddSubstructure 
operation. A substructure is removed by RemoveSubstructure. The set of identities of the substructures 
pointing to the specified destination is returned by the HavingDestination operation. Finally, one can get 
the type of a substructure by using the GetSubstructureType operation. 

68.0  SubstructureType = SET | SEQUENCE | MAP 
69.0  type: AddSubstructure : SubstructureType → Structure → (Structure × Subsid) 
 .1   type: RemoveSubstructure : Subsid → Structure → Structure 
 .2   type: HavingDestination : Destination → Structure → Subsid-set 
 .3   type: GetSubstructureType : Subsid → Structure → SubstructureType 

Set Operations. The AddDestination operation adds a destination to a set of destinations. A destination 
element of a set is removed by RemoveDestination. The HavingDestinationSet operation can be used 
to find out whether a specified destination is in the set. The set of destinations is returned by the 
GetDestinationSet operation. One gets the number of elements in the set by using the GetCardinality 
operation. 

70.0  type: AddDestination : (Subsid × Destination) → Structure → Structure 
 .1   type: RemoveDestination : (Subsid × Destination) → Structure → Structure 
 .2   type: HavingDestinationSet : (Subsid × Destination) → Structure → BOOL 
 .3   type: GetDestinationSet : Subsid → Structure → Destination-set 
 .4   type: GetCardinality : Subsid → Structure → N₀ 






Sequence Operations. One can insert a destination at the specified position in the list by using 
the InsertDestination operation. Destinations at a position greater than or equal to the insertion 
point are shifted one place. By the RemoveDestination operation one can remove the destination at the 
specified position. The operation works in the opposite way to the inserting operation. The 
HavingDestination operation returns all the positions of the specified destination in the sequence. The 
destination at the specified position is returned by GetDestination. GetLength returns the length, i.e. 
the number of destinations in the list. 

71.0  type: InsertDestination : (Subsid × Destination × N₀) → Structure → Structure 
 .1   type: RemoveDestination : (Subsid × N₀) → Structure → Structure 
 .2   type: HavingDestination : (Subsid × Destination) → Structure → N₀-set 
 .3   type: GetDestination : (Subsid × N₀) → Structure → Destination 
 .4   type: GetLength : Subsid → Structure → N₀ 
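The shifting behaviour of the sequence operations can be shown on a plain list. The sketch below works on one sequence substructure at a time and uses 1-based positions, following the usual VDM convention for sequences; that choice, the function names and the use of strings as destinations are assumptions of the example.

```python
from typing import List, Set


def insert_destination(seq: List[str], destination: str, no: int) -> List[str]:
    """Insert at 1-based position `no`; later destinations shift one place."""
    if not 1 <= no <= len(seq) + 1:
        raise IndexError(no)
    return seq[:no - 1] + [destination] + seq[no - 1:]


def remove_destination(seq: List[str], no: int) -> List[str]:
    """Remove the destination at position `no`; later ones shift back."""
    if not 1 <= no <= len(seq):
        raise IndexError(no)
    return seq[:no - 1] + seq[no:]


def having_destination(seq: List[str], destination: str) -> Set[int]:
    """All positions at which the destination occurs."""
    return {i for i, d in enumerate(seq, start=1) if d == destination}


def get_destination(seq: List[str], no: int) -> str:
    if not 1 <= no <= len(seq):
        raise IndexError(no)
    return seq[no - 1]


def get_length(seq: List[str]) -> int:
    return len(seq)
```

Since a destination may occur more than once in a sequence, the query returns a set of positions rather than a single number.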

Map Operations. A new named destination is added by the AddDestination operation and removed 
by RemoveDestination. All the names of a specified destination can be found by HavingDestination. 
One can get the destination identified by a given name by using the GetDestination operation. The set 
of names bound to destinations is returned by GetDomain. 

72.0  type: AddDestination : (Subsid × Name × Destination) → Structure → Structure 
 .1   type: RemoveDestination : (Subsid × Name) → Structure → Structure 
 .2   type: HavingDestination : (Subsid × Destination) → Structure → Name-set 
 .3   type: GetDestination : (Subsid × Name) → Structure → Destination 
 .4   type: GetDomain : Subsid → Structure → Name-set 

The Structure Attribute Operations. AddAttribute adds a named attribute to the set of attributes 
of the structure. Attributes are removed by the RemoveAttribute operation. Values are assigned to 
attributes by the AssignAttribute operation. Finally, the value of an attribute is read by using ReadAttribute. 

73.0  type: AddAttribute : (Subsid × Name) → Structure → Structure 
 .1   type: RemoveAttribute : (Subsid × Name) → Structure → Structure 
 .2   type: AssignAttribute : (Subsid × Name × Value) → Structure → Structure 
 .3   type: ReadAttribute : (Subsid × Name) → Structure → Value 



4 Conclusion 

One of the major decisions in the development of this model has been to separate the presentation and 
browsing semantics from the model and move them to the application design. Applications 
should operate on the hyperbase only through the specified operations, and the data objects should not 
be aware of the applications and their semantics. By adding persistence to this object-oriented 
model we have a model of an object-oriented database. In this way issues of distribution, 
basic version management and access control could be solved in the domain of object management 
systems. It is our intention to combine this model with the European standard on portable common 
tool environments (PCTE) [Thomas 1989]. PCTE is a standard for object-oriented bases for software 
engineering environments. 

We are currently building a prototype of a hyperbase server based on the set of specifications presented 
here. The prototype is being developed in the object-oriented programming language C++. Different hypertext 
applications are being developed for this server to show the feasibility of the model. 

With respect to the work on hypertext standardization, this model should be related to existing 
approaches to hypertext, to seek commonality between different approaches and to make progress 
towards a complete model. It is our opinion that a hypertext standard should be defined in terms 
of abstract datatypes, to retain a maximum of representational abstraction from the viewpoint of the 
hypertext applications. An open point in the model is the interchange mechanism between different 
hyperbases. The model has to be extended with some kind of protocol for the transfer of hypertexts from 
one base to another. 

A Detailed Specifications 
A.1 An Object-Oriented Hyperbase 

74.0  type: CreateInstanceOf : ObjectClass → Hyperbase → (Objid × Hyperbase) 
 .1   pre-CreateInstanceOf(class, ) ≜ class ∈ {NODES, NETWORKS, STRUCTURES} 
 .2   post-CreateInstanceOf(class, hyperbase)(objid, hyperbase') ≜ 
 .3     let objid ∈ Objid \ dom hyperbase in 
 .4     cases class : 
 .5       NODES → hyperbase' = hyperbase ∪ [mk-Nid(objid) ↦ 
 .6                  mk-Object([], NodeOperations, [], {}, {})], 
 .7       NETWORKS → hyperbase' = hyperbase ∪ [mk-Nwid(objid) ↦ 
 .8                  mk-Object([], LinksOperations, [], {}, {})], 
 .9       STRUCTURES → hyperbase' = hyperbase ∪ [mk-Sid(objid) ↦ 
 .10                 mk-Object([], StructureOperations, [], {}, {})] 

75.0  type: DestroyInstance : Objid → Hyperbase → Hyperbase 
 .1   pre-DestroyInstance(objid, hyperbase) ≜ objid ∈ dom hyperbase 
 .2   post-DestroyInstance(objid, hyperbase)(hyperbase') ≜ 
 .3     hyperbase' = [id ↦ (let mk-Object(state, operations, ss, ps) = hyperbase(id) in 
 .4                   mk-Object(state, operations, 
 .5                     (objid ∈ ss → (ss \ {objid}) ∪ s-Succ(hyperbase(objid)), 
 .6                      T → ss), 
 .7                     (objid ∈ ps → (ps \ {objid}) ∪ s-Pred(hyperbase(objid)), 
 .8                      T → ps)))] 

76.0  type: SetOfInstances : ObjectClass → Hyperbase → Objid-set 
 .1   pre-SetOfInstances(class, ) ≜ class ∈ {NODES, NETWORKS, STRUCTURES} 
 .2   post-SetOfInstances(class, hyperbase)(objids) ≜ 
 .3     cases class : 
 .4       NODES → objids = {objid | (∀ objid ∈ dom hyperbase)(is-Node(objid))}, 
 .5       NETWORKS → objids = {objid | (∀ objid ∈ dom hyperbase)(is-Links(objid))}, 
 .6       STRUCTURES → objids = {objid | (∀ objid ∈ dom hyperbase)(is-Structures(objid))} 

A.1.1 Basic Object Version Management 

77.0  type: CreateSuccessorOfInstance : Objid → Hyperbase → (HyperBase × Objid) 
 .1   pre-CreateSuccessorOfInstance(objid, hyperbase) ≜ objid ∈ dom hyperbase 
 .2   post-CreateSuccessorOfInstance(objid, hyperbase)(hyperbase', objid') ≜ 
 .3     let objid' ∈ Objid \ dom hyperbase in 
 .4     let mk-Object(state, operations, attrs, ss, ps) = hyperbase(objid) in 
 .5     hyperbase' = hyperbase + [objid ↦ mk-Object(state, operations, attrs, ss ∪ {objid'}, ps)] 
 .6                  ∪ [objid' ↦ mk-Object(state, operations, attrs, {}, {objid})] 

78.0  type: PredecessorOfInstance : Objid → Hyperbase → Objid-set 
 .1   pre-PredecessorOfInstance(objid, hyperbase) ≜ objid ∈ dom hyperbase 
 .2   post-PredecessorOfInstance(objid, hyperbase)(objids) ≜ objids = s-Pred(hyperbase(objid)) 

79.0  type: SuccessorOfInstance : Objid → Hyperbase → Objid-set 
 .1   pre-SuccessorOfInstance(objid, hyperbase) ≜ objid ∈ dom hyperbase 
 .2   post-SuccessorOfInstance(objid, hyperbase)(objids) ≜ objids = s-Succ(hyperbase(objid)) 

80.0  type: MergeInstances : ... 

A.1.2 Object Access Control 

81.0  type: Open : ... 

82.0  type: Close : ... 

83.0  type: OperateOnInstance : Objid × Opeid × Argument-set → HyperBase → (HyperBase × Result-set) 
 .1   pre-OperateOnInstance(objid, opeid, , hyperbase) ≜ 
 .2     objid ∈ hyperbase ∧ opeid ∈ s-Operations(hyperbase(objid)) 
 .3   post-OperateOnInstance(objid, opeid, as, hyperbase)(hyperbase', rs') ≜ 
 .4     let mk-Object(state, operations, attrs, ss, ps) = hyperbase(objid) in 
 .5     let (state', rs') = operations(opeid)(as, state) in 
 .6     (state' ≠ nil → 
 .7        hyperbase' = hyperbase + [objid ↦ mk-Object(state', operations, attrs, ss, ps)], 
 .8      state' = nil → hyperbase' = hyperbase) 

Object Attribute Operations. 

84.0  type: AddAttribute : ... 

85.0  type: RemoveAttribute : ... 

86.0  type: AssignAttribute : ... 

87.0  type: ReadAttribute : ... 

A.2 The Three Object Classes of a Hyperbase 
A.2.1 A Node Class 
Schema Operations. 

88.0  type: AddSlot : () → Node → (Node × Slid) 
 .1   pre-AddSlot() ≜ T 
 .2   post-AddSlot(node)(node', slid) ≜ 
 .3     let slid ∈ Slid \ dom node in node' = node ∪ [slid ↦ mk-Slot(< >, [], [])] 

89.0  type: RemoveSlot : Slid → Node → Node 
 .1   pre-RemoveSlot(slid, node) ≜ slid ∈ dom node 
 .2   post-RemoveSlot(slid, node)(node') ≜ node' = node \ {slid} 

90.0  type: ReturnSlots : () → Node → Slid-set 
 .1   pre-ReturnSlots() ≜ T 
 .2   post-ReturnSlots(node)(slids) ≜ slids = dom node 
Slot Browsing Operations. 

91.0  type: SlotView : Slid → Node → String 
 .1   pre-SlotView(slid, node) ≜ slid ∈ dom node 
 .2   post-SlotView(slid, node)(text) ≜ 
 .3     let mk-Slot(string, , ) = node(slid) in text = string 

92.0  type: SlotInsert : (Slid × String × Position) → Node → Node 
 .1   pre-SlotInsert(slid, , position, node) ≜ 
 .2     slid ∈ dom node ∧ (let mk-Slot(str, , ) = node(slid) in 0 < position < len str) 
 .3   post-SlotInsert(slid, s, position, node)(node') ≜ 
 .4     (let mk-Slot(text, handles, attrs) = node(slid), 
 .5          mk-Slot(text', handles', attrs') = node'(slid) in 
 .6      text' = <text[i] | 0 < i < position> ⌢ s ⌢ <text[i] | position < i < len text> ∧ 
 .7      (∀ hid ∈ dom handles)(let (p, l) = handles(hid), (p', l') = handles'(hid) in 
 .8         p + l < position → p' = p ∧ l' = l, 
 .9         p < position ≤ p + l → p' = p ∧ l' = l + length, 
 .10        position ≤ p → p' = p + length ∧ l' = l)) 

93.0  type: SlotDelete : (Slid × Position × Length) → Node → (Node × Hid-set) 
 .1   pre-SlotDelete(slid, position, , node) ≜ 
 .2     slid ∈ dom node ∧ 
 .3     (let mk-Slot(str, , ) = node(slid) in 
 .4      0 < position < len str ∧ position + length < len str) 
 .5   post-SlotDelete(slid, position, length, node)(node', hids) ≜ 
 .6     (let mk-Slot(text, handles, attrs) = node(slid), 
 .7          mk-Slot(text', handles', attrs) = node'(slid) in 
 .8      text' = <text[i] | 0 < i < position> ⌢ <text[i] | position + length < i < len text> ∧ 
 .9      hids = {hid | (∀ hid ∈ dom handles)(let (p, l) = handles(hid) in 
 .10               position < p ∧ position + length > p + l)} ∧ 
 .11     dom handles' = dom handles \ hids ∧ 
 .12     position < p ∧ position + length < p → p' = p - position ∧ l' = l, 
 .13     position < p ∧ position + length < p + l → p' = p - position ∧ l' = l - (position + length - p), 
 .14     p < position ∧ position + length < p + l → p' = p ∧ l' = l - length, 
 .15     p < position ∧ p + l < position + length → p' = p ∧ l' = l - (p + l - position), 
 .16     p + l < position → p' = p ∧ l' = l) 
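The way handle regions follow an insertion can be illustrated with a small function. This is a sketch of the three cases in the post-condition of SlotInsert, not a transcription of it: the function name `shift_handles`, the 0-based positions and the exact treatment of the boundaries (an insertion at the start of a region shifts it, an insertion at its end leaves it unchanged) are assumptions made here.

```python
from typing import Dict, Tuple

Region = Tuple[int, int]  # (p, l): start position and length


def shift_handles(handles: Dict[str, Region], position: int,
                  length: int) -> Dict[str, Region]:
    """Adjust handle regions after `length` characters are inserted at `position`."""
    adjusted: Dict[str, Region] = {}
    for hid, (p, l) in handles.items():
        if position <= p:
            adjusted[hid] = (p + length, l)   # insertion before the region: shift it
        elif position < p + l:
            adjusted[hid] = (p, l + length)   # insertion inside the region: grow it
        else:
            adjusted[hid] = (p, l)            # insertion after the region: unchanged
    return adjusted


handles = {"a": (0, 4), "b": (6, 3), "c": (10, 2)}
print(shift_handles(handles, 7, 5))
```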

Handle Operations. 

94.0  type: AddHandle : (Slid × Region) → Node → (Node × Hid) 
 .1   pre-AddHandle(slid, mk-Region(pos, length), node) ≜ 
 .2     slid ∈ dom node ∧ (let mk-Slot(str, , ) = node(slid) in pos + length < len str) 
 .3   post-AddHandle(slid, region, node)(node', hid) ≜ 
 .4     let mk-Slot(text, handles, attrs) = node(slid), hid ∈ Hid \ dom handles in 
 .5     node' = node + [slid ↦ mk-Slot(text, handles ∪ [hid ↦ region], attrs)] 

95.0  type: RemoveHandle : (Slid × Hid) → Node → Node 
 .1   pre-RemoveHandle(slid, hid, node) ≜ 
 .2     slid ∈ dom node ∧ (let mk-Slot( , handles, ) = node(slid) in hid ∈ dom handles) 
 .3   post-RemoveHandle(slid, hid, node)(node') ≜ 
 .4     let mk-Slot(text, handles, attrs) = node(slid) in 
 .5     node' = node + [slid ↦ mk-Slot(text, handles \ {hid}, attrs)] 

96.0  type: ReturnSlotHandles : Slid → Node → Hid-set 
 .1   pre-ReturnSlotHandles(slid, node) ≜ slid ∈ dom node 
 .2   post-ReturnSlotHandles(slid, node)(hids) ≜ hids = dom s-Handles(node(slid)) 

97.0  type: ReturnPositionHandles : (Slid × Position) → Node → Hid-set 
 .1   pre-ReturnPositionHandles(slid, position, node) ≜ 
 .2     slid ∈ dom node ∧ (let mk-Slot(str, , ) = node(slid) in position < len str) 
 .3   post-ReturnPositionHandles( , position)(hids) ≜ 
 .4     let mk-Slot( , handles, ) = node(slid) in 
 .5     hids = {hid ∈ dom handles | (let mk-Region(p, l) = handles(hid) in 
 .6              p ≤ position < p + l)} 

98.0  type: GetHandle : (Slid × Hid) → Node → Region 
 .1   pre-GetHandle(slid, hid, node) ≜ slid ∈ dom node ∧ hid ∈ dom s-Handles(node(slid)) 
 .2   post-GetHandle(slid, hid, node)(region) ≜ 
 .3     let mk-Slot( , handles, ) = node(slid) in region = handles(hid) 

The Slot Attribute Operations. 

99.0  type: AddAttribute : ... 

100.0 type: RemoveAttribute 

101.0 type: AssignAttribute : ... 

102.0 type: ReadAttribute : ... 

A.2.2 A Network Class 
Network Changing Operations. 

103.0 type: AddLink : () → Links → (Links × Lid) 
 .1   pre-AddLink() ≜ T 
 .2   post-AddLink(links)(links', lid') ≜ 
 .3     let lid' ∈ Lid \ dom links in links' = links ∪ [lid' ↦ mk-Link(mk-Connections({ }, { }), [])] 

104.0 type: RemoveLink : Lid → Links → Links 
 .1   pre-RemoveLink(lid, links) ≜ lid ∈ dom links 
 .2   post-RemoveLink(lid, links)(links') ≜ links' = links \ {lid} 

105.0 type: AddAnchor : (Lid × Anchor) → Links → Links 
 .1   pre-AddAnchor(lid, , links) ≜ lid ∈ dom links 
 .2   post-AddAnchor(lid, anchor, links)(links') ≜ 
 .3     let mk-Link(mk-Connections(as, ds), attrs) = links(lid) in 
 .4     links' = links + [lid ↦ mk-Link(mk-Connections(as ∪ {anchor}, ds), attrs)] 

106.0 type: RemoveAnchor : (Lid × Anchor) → Links → Links 
 .1   pre-RemoveAnchor(lid, anchor, links) ≜ 
 .2     lid ∈ dom links ∧ (let mk-Connections(as, ) = links(lid) in anchor ∈ as) 
 .3   post-RemoveAnchor(lid, anchor, links)(links') ≜ 
 .4     let mk-Link(mk-Connections(as, ds), attrs) = links(lid) in 
 .5     links' = links + [lid ↦ mk-Link(mk-Connections(as \ {anchor}, ds), attrs)] 

107.0 type: AddDestination : (Lid × Destination) → Links → Links 
 .1   pre-AddDestination(lid, destination, links) ≜ lid ∈ dom links 
 .2   post-AddDestination(lid, destination, links)(links') ≜ 
 .3     let mk-Link(mk-Connections(as, ds), attrs) = links(lid) in 
 .4     links' = links + [lid ↦ mk-Link(mk-Connections(as, ds ∪ {destination}), attrs)] 

108.0 type: RemoveDestination : (Lid × Destination) → Links → Links 
 .1   pre-RemoveDestination(lid, destination, links) ≜ 
 .2     lid ∈ dom links ∧ (let mk-Connections( , ds) = links(lid) in destination ∈ ds) 
 .3   post-RemoveDestination(lid, destination, links)(links') ≜ 
 .4     let mk-Link(mk-Connections(as, ds), attrs) = links(lid) in 
 .5     links' = links + [lid ↦ mk-Link(mk-Connections(as, ds \ {destination}), attrs)] 
Network Querying Operations. 

109.0 type: HavingAnchor : Anchor → Links → Lid-set 
 .1   pre-HavingAnchor() ≜ T 
 .2   post-HavingAnchor(anchor, links)(lids) ≜ 
 .3     lids = {lid ∈ dom links | let mk-Link(mk-Connections(as, ), ) = links(lid) in anchor ∈ as} 

110.0 type: HavingDestination : Destination → Links → Lid-set 
 .1   pre-HavingDestination() ≜ T 
 .2   post-HavingDestination(destination, links)(lids) ≜ 
 .3     lids = {lid ∈ dom links | let mk-Link(mk-Connections( , ds), ) = links(lid) in destination ∈ ds} 

111.0 type: ReadLink : Lid → Links → (Anchor-set × Destination-set) 
 .1   pre-ReadLink(lid, links) ≜ lid ∈ dom links 
 .2   post-ReadLink(lid, links)(as, ds) ≜ mk-Link(mk-Connections(as, ds), ) = links(lid) 

The Link Attribute Operations. 

112.0 type: AddAttribute : ... 

113.0 type: RemoveAttribute : ... 

114.0 type: AssignAttribute : ... 

115.0 type: ReadAttribute : ... 

A.2.3 A Structural Class. 
Structure Operations 

116.0 type: AddSubstructure : SubstructureType → Structure → (Structure × Subsid)
  .1 pre-AddSubstructure() ≜ T
  .2 post-AddSubstructure(type, structure)(structure', subsid) ≜
  .3   let subsid ∈ Subsid \ dom structure in
  .4   structure' = structure ∪ [subsid ↦
  .5     mk-Substructure((cases type :
  .6       Set → mk-Set({ }),
  .7       Sequence → mk-Seq(< >),
  .8       Map → mk-Map([ ])), [ ])]

117.0 type: RemoveSubstructure : Subsid → Structure → Structure
  .1 pre-RemoveSubstructure(subsid, structure) ≜ subsid ∈ dom structure
  .2 post-RemoveSubstructure(subsid, structure)(structure') ≜ structure' = structure \ {subsid}

118.0 type: HavingDestination : Destination → Structure → Subsid-set
  .1 pre-HavingDestination() ≜ T
  .2 post-HavingDestination(destination, structure)(subsids) ≜
  .3   subsids = {subsid | subsid ∈ dom structure
  .4     ∧ let mk-Substructure(substruc, ) = structure(subsid) in
  .5       cases substruc :
  .6         Set → destination ∈ s,
  .7         Sequence → destination ∈ elems s,
  .8         Map → destination ∈ rng s}

119.0 type: GetSubstructureType : Subsid → Structure → SubstructureType
  .1 pre-GetSubstructureType(subsid, structure) ≜ subsid ∈ dom structure
  .2 post-GetSubstructureType(subsid, structure)(type) ≜
  .3   let mk-Substructure(substruc, ) = structure(subsid) in
  .4   type = (cases substruc :
  .5     mk-Set() → Set,
  .6     mk-Seq() → Sequence,
  .7     mk-Map() → Map)
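
Operations 116 to 119 can be read as a small functional data model. The following Python sketch is one possible rendering and is not part of the specification. It assumes integer Subsids, models Set, Sequence and Map substructures as frozenset, tuple and dict, and leaves attributes as an opaque tuple; all function and class names are illustrative.

```python
from dataclasses import dataclass
from typing import Any, Dict, Set, Tuple


@dataclass(frozen=True)
class Substructure:
    content: Any          # frozenset (Set), tuple (Sequence) or dict (Map)
    attrs: Tuple = ()     # attributes are left abstract here


Structure = Dict[int, Substructure]   # Subsid -> Substructure (int ids are an assumption)


def add_substructure(kind: str, structure: Structure) -> Tuple[Structure, int]:
    """Operation 116: bind a fresh Subsid to an empty substructure of the given kind."""
    empty = {"Set": frozenset(), "Sequence": (), "Map": {}}[kind]
    subsid = max(structure, default=0) + 1      # any id outside dom structure would do
    return {**structure, subsid: Substructure(empty)}, subsid


def remove_substructure(subsid: int, structure: Structure) -> Structure:
    """Operation 117: the precondition requires subsid to be in dom structure."""
    assert subsid in structure, "pre: subsid in dom structure"
    return {k: v for k, v in structure.items() if k != subsid}


def get_substructure_type(subsid: int, structure: Structure) -> str:
    """Operation 119: report which kind of substructure a Subsid denotes."""
    assert subsid in structure, "pre: subsid in dom structure"
    content = structure[subsid].content
    if isinstance(content, frozenset):
        return "Set"
    if isinstance(content, tuple):
        return "Sequence"
    return "Map"


def having_destination(destination: Any, structure: Structure) -> Set[int]:
    """Operation 118: all Subsids whose substructure mentions the destination."""
    def mentions(content: Any) -> bool:
        values = content.values() if isinstance(content, dict) else content
        return destination in values
    return {sid for sid, sub in structure.items() if mentions(sub.content)}
```

Each function returns a new Structure rather than updating one in place, which mirrors the way the post-conditions relate structure to structure'.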

Set Operations 

120.0 type: AddDestination : (Subsid × Destination) → Structure → Structure
  .1 pre-AddDestination(subsid, , structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in is-Set(substruc)
  .4 post-AddDestination(subsid, destination, structure)(structure') ≜
  .5   let mk-Substructure(substruc, attrs) = structure(subsid) in
  .6   structure' = structure + [subsid ↦ mk-Substructure(substruc ∪ {destination}, attrs)]

121.0 type: RemoveDestination : (Subsid × Destination) → Structure → Structure
  .1 pre-RemoveDestination(subsid, destination, structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in
  .4   is-Set(substruc) ∧ destination ∈ substruc
  .5 post-RemoveDestination(subsid, destination, structure)(structure') ≜
  .6   let mk-Substructure(substruc, attrs) = structure(subsid) in
  .7   structure' = structure + [subsid ↦ mk-Substructure(substruc \ {destination}, attrs)]

122.0 type: HavingDestinationSet : (Subsid × Destination) → Structure → BOOL
  .1 pre-HavingDestinationSet(subsid, , structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in is-Set(substruc)
  .4 post-HavingDestinationSet(subsid, destination, structure)(b) ≜
  .5   let mk-Substructure(substruc, ) = structure(subsid) in b = (destination ∈ substruc)

123.0 type: GetDestinationSet : (Subsid) → Structure → Destination-set
  .1 pre-GetDestinationSet(subsid, structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in is-Set(substruc)
  .4 post-GetDestinationSet(subsid, structure)(ds) ≜
  .5   let mk-Substructure(substruc, ) = structure(subsid) in ds = substruc

124.0 type: GetCardinality : Subsid → Structure → N0
  .1 pre-GetCardinality(subsid, structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let substruc = s-Substruc(structure(subsid)) in is-Set(substruc)
  .4 post-GetCardinality(subsid, structure)(cd) ≜
  .5   let mk-Substructure(substruc, ) = structure(subsid) in cd = card substruc

Sequence Operations.

125.0 type: InsertDestination : (Subsid × Destination × N0) → Structure → Structure
  .1 pre-InsertDestination(subsid, , index, structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in
  .4   is-Seq(substruc) ∧ 0 ≤ index ≤ len substruc
  .5 post-InsertDestination(subsid, destination, index, structure)(structure') ≜
  .6   let mk-Substructure(substruc, attrs) = structure(subsid) in
  .7   structure' = structure + [subsid ↦ mk-Substructure(<substruc[i] | 0 < i ≤ index> ^
  .8     <destination> ^ <substruc[i] | index < i ≤ len substruc>, attrs)]

126.0 type: RemoveDestination : (Subsid × N0) → Structure → Structure
  .1 pre-RemoveDestination(subsid, index, structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in
  .4   is-Seq(substruc) ∧ 0 < index ≤ len substruc
  .5 post-RemoveDestination(subsid, index, structure)(structure') ≜
  .6   let mk-Substructure(substruc, attrs) = structure(subsid) in
  .7   structure' = structure + [subsid ↦
  .8     mk-Substructure(<substruc[i] | 0 < i < index> ^
  .9       <substruc[i] | index < i ≤ len substruc>, attrs)]

127.0 type: HavingDestination : (Subsid × Destination) → Structure →
  .1 pre-HavingDestination(subsid, destination, structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in is-Seq(substruc)
  .4 post-HavingDestination(subsid, destination, structure)(indices) ≜
  .5   let mk-Substructure(substruc, ) = structure(subsid) in
  .6   indices = {i | (i ∈ ind substruc)(substruc[i] = destination)}

128.0 type: GetDestination : (Subsid × N0) → Structure → Destination
  .1 pre-GetDestination(subsid, index, structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in
  .4   is-Seq(substruc) ∧ 0 < index ≤ len substruc
  .5 post-GetDestination(subsid, index, structure)(destination) ≜
  .6   let mk-Substructure(substruc, ) = structure(subsid) in destination = substruc[index]

129.0 type: GetLength : (Subsid) → Structure → N0
  .1 pre-GetLength(subsid, structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in is-Seq(substruc)
  .4 post-GetLength(subsid, structure)(length) ≜
  .5   let mk-Substructure(substruc, ) = structure(subsid) in length = len substruc
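
The index conventions of the sequence operations are easiest to check on a concrete value. The sketch below is an illustration only, working on a bare Python tuple rather than a full Structure. It assumes that positions are counted from 1 (as the use of len suggests) and that an insertion index of 0 means "before the first element"; the function names are hypothetical.

```python
from typing import Any, Set, Tuple

Seq = Tuple[Any, ...]


def insert_destination(seq: Seq, destination: Any, index: int) -> Seq:
    """Insert after position `index`; index 0 puts the element first."""
    assert 0 <= index <= len(seq), "pre: 0 <= index <= len substruc"
    return seq[:index] + (destination,) + seq[index:]


def remove_destination(seq: Seq, index: int) -> Seq:
    """Drop the element at 1-based position `index`."""
    assert 1 <= index <= len(seq), "pre: 0 < index <= len substruc"
    return seq[:index - 1] + seq[index:]


def having_destination(seq: Seq, destination: Any) -> Set[int]:
    """All 1-based positions at which the destination occurs."""
    return {i for i, d in enumerate(seq, start=1) if d == destination}


def get_destination(seq: Seq, index: int) -> Any:
    """The element at 1-based position `index`."""
    assert 1 <= index <= len(seq), "pre: 0 < index <= len substruc"
    return seq[index - 1]


s = insert_destination((), "a", 0)
s = insert_destination(s, "c", 1)
s = insert_destination(s, "b", 1)   # lands between "a" and "c"
```

Under these assumptions a sequence may hold the same destination more than once, which is why having_destination returns a set of positions rather than a single index.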
Map Operations.

130.0 type: AddDestination : (Subsid × Name × Destination) → Structure → Structure
  .1 pre-AddDestination(subsid, name, , structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in
  .4   is-Map(substruc) ∧ name ∉ dom substruc
  .5 post-AddDestination(subsid, name, destination, structure)(structure') ≜
  .6   let mk-Substructure(substruc, attrs) = structure(subsid) in

131.0 type: RemoveDestination : (Subsid × Name) → Structure → Structure
  .1 pre-RemoveDestination(subsid, name, structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in
  .4   is-Map(substruc) ∧ name ∈ dom substruc
  .5 post-RemoveDestination(subsid, name, structure)(structure') ≜
  .6   let mk-Substructure(substruc, attrs) = structure(subsid) in
  .7   structure' = structure + [subsid ↦ mk-Substructure(substruc \ {name}, attrs)]

132.0 type: HavingDestination : (Subsid × Destination) → Structure → Name-set
  .1 pre-HavingDestination(subsid, , structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in is-Map(substruc)
  .4 post-HavingDestination(subsid, destination, structure)(names) ≜
  .5   let mk-Substructure(substruc, attrs) = structure(subsid) in
  .6   names = {name | (name ∈ dom substruc)(substruc(name) = destination)}

133.0 type: GetDestination : (Subsid × Name) → Structure → Destination
  .1 pre-GetDestination(subsid, name, structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in is-Map(substruc)
  .4 post-GetDestination(subsid, name, structure)(destination) ≜
  .5   let mk-Substructure(substruc, attrs) = structure(subsid) in destination = substruc(name)

134.0 type: GetDomain : Subsid → Structure → Name-set
  .1 pre-GetDomain(subsid, structure) ≜
  .2   subsid ∈ dom structure ∧
  .3   let mk-Substructure(substruc, ) = structure(subsid) in is-Map(substruc)
  .4 post-GetDomain(subsid, structure)(ns) ≜
  .5   let mk-Substructure(substruc, attrs) = structure(subsid) in ns = dom substruc

The Structure Attribute Operations.

135.0 type: AddAttribute : ...

136.0 type: RemoveAttribute : ...

137.0 type: AssignAttribute : ...

138.0 type: ReadAttribute : ...

Hypertext is most useful as a technology when it is embedded in an application: a paperless technical
manual, a notetaker, a specification management system, or any other task domain where it is useful to
represent and manipulate the structure of text. We feel that it is important to connect system 
requirements for hypertext with the situation of use; thus standardization efforts should be directed at 
enhancing the ability to embed hypertext in heterogeneous applications environments. 

This paper addresses a specific application and task environment - using hypertext as a medium for a 
shared notetaker that will be used in the intelligence community - and how it suggests a protocol-driven 
approach to integration. The work described in this paper includes an informal work practices study of 
the task environment, and the development of a functional specification for a hypertext system for 
notetaking. 

From the study and the development of a specification, we postulate that standardization of a 
multi-tiered system of linking protocols will help address the closed-world problem that we have 
encountered in NoteCards and many of the other second-generation hypertext systems without 
specifying rigid standards for applications that want to share information to a greater or lesser extent 
with a hypertext substrate. Such a system of protocols can be based in part on existing work on 
hypertext exchange and hypertext reference models. 

First we will briefly describe the task environment and present an informal model of the task. Then we 
will go on to describe linking and anchoring requirements in support of this task. Finally, we will argue
that a multi-tiered system of linking protocols will not only meet the needs that we have already 
identified, but will be adaptable as the environment changes and will facilitate information sharing. It is 
this set of protocols that we propose should be standardized based on negotiations between 
applications developers and the hypertext community. 



An informal model of analytic activities

The specification we developed describes a hypertext system to support intelligence analysts in their
notetaking and other sense-making activities. We based the specification on requirements derived 
during the course of an informal work practices study that we conducted at the user site, coupled with 
our previous understanding of the idea processing task (see [Halasz et al. 1987], [Trigg et al. 1986] , 
and [Trigg et al. 1987] for discussions of various aspects of idea processing in NoteCards). 

The analysts we studied work in a rich, complex environment of systems and information sources. 
From these sources they gather information, mostly by scanning the cables they receive through an 
institutional mail system, or by retrieving information from a variety of on-line resources (including 
outside information services like Dialog). They read and interpret information they gather, manifesting
their interpretation in one of several ways. Sometimes they take notes on what they read or annotate 
the sources before filing them in their personal on-line or hardcopy file systems; in other cases they 
reflect their understanding of the material by simply filing source material or organizing it in response to 
a specific assignment. The product of this interpretation process is usually either a formal written 
analytic paper, or a shorter (and less formal) article. 

Thus, information gathering and retrieval, interpreting sources through notetaking and filing, and 
authoring reports are all important parts of the analytic task. These processes interact in a variety of 
ways; notetaking can be driven by information gathering, such as culling an electronic mail inbox, or it can be
driven by the production of a written report. Retrieval needs may be refined in the interpretation 
process as the analyst tries to make sense of the information at hand, or they may be related directly to 
a specific assignment. Structures to organize information may also be dictated by either sources or 
products, or by the internal models of a domain that an analyst has evolved over his or her career. 
Finally, presentations may be prompted by analytic requirements, or they may be driven by new 
interpretations that come out of the earlier processes in the flow. 

Furthermore, we found that the broader categories of analytic information processing are collaborative 
or coordinated with people in other organizational roles. Interpretation is often collaborative, sometimes
involving telephone conversations, or (less commonly) informal face-to-face meetings. Interpretive 
collaboration is initiated by three different types of questions: (1) "What do you make of it?" (2) "Do 
you agree with this (or can you corroborate this)?" and (3) "What are the implications of this?" If the 
collaboration looks to be fruitful, a draft-passing co-authorship is negotiated between the two analysts, 
hence starting a presentation-phase collaboration. Coordination occurs in retrieval tasks in two ways: 
(1) Some members of the analytic work group have specific expertise in retrieval and can help an 
analyst gather information he or she needs from the institutional or outside sources. (2) Some analysts 
have specific resources (like their own extensive files); it is a coordinated effort to locate the desired 
information from those files.

Figure 1 sketches the flow between the categories of analytic activities and shows how they may be
conducted in a collaborative setting.

Figure 1. Analytic information processing activities



In order to determine requirements for hypertext in the context of this task environment, it is important 
to investigate three areas: (1) where the information comes from; (2) the relationship between the kinds 
of notes analysts take and the information sources; and (3) what use the information is put to after the
interpretation is complete. From looking at (1) and (3), we will be able to determine a strategy for
integrating hypertext into an applications environment, and from (2), we will understand requirements on 
linking pieces of information together. 

Where information comes from. The analysts we studied use a variety of sources, some currently
available on-line or destined to be on-line in the foreseeable future, and others that will continue to be
available only in hardcopy forms. Frequently cited anecdotal evidence suggests that only five percent 
or so of the available data is ever used in analysis; therefore, analysts all feel very strongly about pulling 
in material from a variety of sources and processing as much of it as possible. It is a widely held belief 
in the intelligence community that contradictory analytic results stem from the use of different sources,
rather than from different interpretations of the same facts. 

We have categorized the sources of on-line information that analysts use into four groups: personal files 
and databases, information from systems maintained by the analyst's working group, information from 
institutional databases and mail systems, and information maintained external to the organization such 
as open literature databases. These categories suggest that there are varying degrees of control that
hypertext developers will have over the systems and databases supplying this information. At best - as
in the case of personal files and working group databases - the hypertext substrate will be able to
represent and display the information at both ends of a link; at worst - the cases where commercial
information sources are used - the hypertext substrate will only be able to represent a method for 
initiating the outside application. 

In our study the most important source of day-to-day on-line information is the institutional mail system 
that supplies each analyst with cable traffic, filtered by an interest profile. Each analyst described a 
process of going through the day's institutional mail in a linear sequence and deciding which messages 
are of interest. Currently, these messages are hardcopied for further processing, mainly highlighting 
and otherwise marking them up. Therefore, the most prevalent example of where the information 
comes from falls between the two extremes. 

How notes are related to sources. The analysts we studied exhibited a range of notetaking styles. 
Many of them relied strictly on annotative notes; that is, they would make hardcopies of source 
materials, and mark up the pages. Annotative notes are taken in two different ways. Often, a 
broad-tipped highlighting pen is used to go over words, sentences, or paragraphs of particular interest.
Some analysts have a preference for specific colors when they are doing this type of highlighting
annotation. The second annotative style of notetaking involves writing short notes in the margins of the
hardcopy. For example, one of the analysts marked things he did not believe to be true, or that he
found anomalous; he noted those beliefs in the margins. Annotative notes are closely bound to
selected segments of text; in hypertext terms, they rely on access to a portion of the content of a node. 

We found that the analysts also use interpretive notes to record hypotheses, conclusions they have
reached, or material they have integrated from several sources. These notes are frequently taken 
on-line in the text editor; sometimes this style of notetaking involves a significant amount of retyping to 
associate notes with their sources. Analysts also take interpretive notes that do not refer directly to any 
source, or that refer to a computational model. Interpretive notes are less tightly bound to individual 
words or sentences in a document. More often, they refer to a general assimilation of the document's 
content. Thus they frequently point to what would be represented in hypertext as a node.

All of the analysts in our study made some use of reminding notes, Post-its or other jottings on paper 
that serve to jog their memory about things to do (an agenda of subtasks) or portions of procedures to 
follow (for example, how to log on to a given outside data service, or how to retrieve a piece of 
information). Reminding notes may be an important way of preserving procedural knowledge. These 
notes often do not refer directly to a node or its content, but rather how to get to it; they can be thought 
of as referring to the link. 

Figure 2 summarizes the three categories of notetaking styles we observed in the work group.

Figure 2. Analysis of notetaking styles

How information is used. Information is used in two ways: analysts build up personal files, and they
write analytic reports and short articles, artifacts recognized by the community. This paper will not 
discuss our findings about how notes and collected information are filed. Instead we will focus on the 
use of information in analytic products, since one analyst's filing structure is usually opaque to the other 
analysts. It is difficult for analysts to retrieve information from one another's files, and once an analyst 
leaves the organization, his or her files quickly deteriorate in value. Thus, in order to make the 
information useful to anyone else, the analyst must either document this structure or publish any
interesting analytic results. 

Two kinds of analytic products are supported by the institutional system, formal publications and shorter 
articles. These analytic products are created by integrating on-line sources and notes, and collections 
of annotated hardcopy material. Most of the analysts pull out their collection of materials on the desired 
subject to create a context for writing and to maintain traceability, which is universally cited as an
important requirement on (and role for) hypertext. In all cases, the publication of an analytic product, 
and the subsequent usefulness of the document or article is directly related to the ability to, in hypertext 
terms, follow its links back to the sources. 

Once an analytic product has gone through the coordination cycle, it may be used by low level 
policy-makers, by various staff members, and by other analysts (sometimes affiliated with different 
agencies). Analysts expressed a desire for a "lighter weight" analytic product in order to share smaller 
chunks of analytic results with their community and receive credit for coming up with these results; in 
hypertext terms, we might think of this as sharing an interpretive layer over a heterogeneous collection 
of databases. 



Linking and anchoring in support of notetaking



From our observations about notetaking in the analytic process, we have derived a set of requirements 
on links, how they are anchored, and what this implies about an integration strategy. 



Links are named, typed, and have direction. Because we expect a variety of relationships between
nodes (for example, an analyst might want to specify relationships like source, supports, or refutes),
links must be named. Furthermore, since we expect links to have different characteristics, links must
have types, so that a behavior can be associated with the named link. In NoteCards, we have found
that specifying the directionality of a relationship is somewhat difficult for users; however,
we still feel that representation of the direction of a link may be useful for expressing dependencies.

Links are n-ary. For a hypertext notetaker, n-ary links are important for representing the relationships
implied by what we have called interpretive notes. An interpretive note can integrate or synthesize the
information in more than one source; hence, the link from the note to the source would require multiple
endpoints to accurately represent what is going on in the notetaking process. Figure 3 illustrates an 
n-ary link example. In this example, Note #1 integrates material from the highlighted portion of Source 
A and Source B. 



Figure 3. Example of how n-ary links may be used in the notetaker
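
The two requirements above (links that are named, typed, and directed, and links with more than two endpoints) can be pictured as a small record type. This is a minimal sketch under our own assumptions, not a design taken from NoteCards or from the specification: the class names, field names, and character offsets are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Endpoint:
    node: str                               # node identifier
    span: Optional[Tuple[int, int]] = None  # (start, end) offsets, or None for the whole node


@dataclass(frozen=True)
class Link:
    name: str                       # relationship, e.g. "source", "supports", "refutes"
    link_type: str                  # selects the behavior attached to the link
    origins: Tuple[Endpoint, ...]   # where the link starts (this gives it a direction)
    targets: Tuple[Endpoint, ...]   # where it points; several targets make it n-ary

    def arity(self) -> int:
        return len(self.origins) + len(self.targets)


# The Figure 3 situation: Note #1 integrates material from Source A and Source B.
# The character offsets are invented for illustration.
note_link = Link(
    name="source",
    link_type="interpretive",
    origins=(Endpoint("Note #1"),),
    targets=(Endpoint("Source A", span=(120, 260)), Endpoint("Source B", span=(40, 95))),
)
```

Keeping origins and targets in separate fields is one way to record direction without forcing every link to be binary.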

Links can either connect nodes or refer to nodes. There are two different notions of linking in 
hypermedia systems. Reference links are components within a node that contain a name or address
that refers to another node (or a region within another node), or a procedure for retrieving that node; 
thus a link's destination can be computed at traversal. Reference linking is important in the case where 
an analyst is performing a query to an external database and wants dynamically computed results. 

Connection links are components that connect a node or region within a node with another node or a
region within it; the objects at both ends of the link "know" about the link. For the purposes of the
notetaker, connections will provide a stronger tie between the information at the source and the
annotative or interpretive note at the other end of the link. 
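
The difference between the two notions of linking can be shown in a few lines. The following sketch is illustrative only: a Python dictionary stands in for the external database, and the class names and sample identifiers are assumptions rather than anything defined in the paper.

```python
from typing import Callable, Dict, List


class Node:
    def __init__(self, name: str) -> None:
        self.name = name
        self.links: List["ConnectionLink"] = []   # links this node knows about


class ConnectionLink:
    """Both endpoints record the link, so either end can find the other."""

    def __init__(self, a: Node, b: Node) -> None:
        self.ends = (a, b)
        a.links.append(self)
        b.links.append(self)

    def other_end(self, start: Node) -> Node:
        a, b = self.ends
        return b if start is a else a


class ReferenceLink:
    """Holds only a procedure; the destination is computed at traversal time."""

    def __init__(self, resolve: Callable[[], List[str]]) -> None:
        self.resolve = resolve

    def traverse(self) -> List[str]:
        return self.resolve()


# A dictionary stands in for an external database.
database: Dict[str, List[str]] = {"port closures": ["cable-17"]}
query_link = ReferenceLink(lambda: list(database["port closures"]))

first = query_link.traverse()
database["port closures"].append("cable-23")   # the outside source changes
second = query_link.traverse()                 # the same link now yields more

source, note = Node("cable-17"), Node("note-1")
tie = ConnectionLink(source, note)
```

The reference link returns different results on its two traversals because nothing is stored but the query, while the connection link is visible from both the source and the note.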

Links can be anchored in a span of text. A link anchor is the span within a node corresponding to the
endpoint of a link. In some hypermedia systems the span may be limited to a single point (e.g.,
NoteCards [Halasz et al. 1987]) or to the entire node (e.g., gIBIS [Conklin & Begeman 1988]). Other
anchoring schemes (e.g., Intermedia [Garrett et al. 1986]) may allow anchors to encompass arbitrary
extents of text (or graphics) within a node.



Analysts' notetaking practices suggest a need for "span-to-span" links, where an arbitrary region or 
collection of objects can be connected with another arbitrary region or collection of objects as illustrated 
in Figure 4. Span-to-span linking is important to the notetaker because most source-connected notes 
that analysts take generally refer to a region of text. Furthermore, it is important to identify which parts 
of a multi-source note or a document refer to which sources.

Figure 4. Span-to-span linking

More specifically, span-to-span linking supports the kind of annotative notetaking that we have 
observed. The anchoring and marking process is similar to the highlighting that analysts use to set 
apart a region of text. In this case, it is the delimiting of text that is important; a special link type can 
support this span-to-null link. The ability to include marginalia as annotations depends on using a 
span-to-node or span-to-span link. See [Catlin et al. 1989] for an example of how span-to-span linking 
can support annotation. 
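
The three anchoring cases mentioned here (span-to-null for highlighting, span-to-node, and span-to-span for marginalia) differ only in what sits at the far end of the link. A minimal sketch, assuming character offsets into plain-text nodes; the type names and sample text are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Union


@dataclass(frozen=True)
class Span:
    node: str
    start: int   # offset of the first character
    end: int     # offset just past the last character


Destination = Optional[Union[Span, str]]   # a span, a whole node (by name), or nothing


def anchored_text(span: Span, nodes: Dict[str, str]) -> str:
    """Return the text an anchor covers."""
    return nodes[span.node][span.start:span.end]


def link_shape(destination: Destination) -> str:
    """Classify a link whose origin is a span by what it points to."""
    if destination is None:
        return "span-to-null"    # pure highlighting
    if isinstance(destination, Span):
        return "span-to-span"
    return "span-to-node"


nodes = {
    "cable": "shipment delayed at port",
    "note": "delay is not credible",
}
highlight = Span("cable", 9, 16)     # covers the word "delayed"
marginal = Span("note", 9, 21)       # covers "not credible"
```

Treating highlighting as a link with no destination lets the same anchoring machinery serve both marking and annotation.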

Links are marked to reflect their properties. Link markers are the method by which the system 
indicates the presence of a link anchor to the user. What information a link marker displays should 
reflect its function. Link markers in the notetaker should allow an analyst to detect the presence of a
link without requiring extra action (as an annotation can be detected), distinguish the level of integration 
of the link's destination, and determine the scope of the anchor's span (as highlighting shows scope). 

Links can be annotated. Because procedural or reminding notes sometimes refer to links, rather than
to nodes, links should have the ability to be annotated. In the case of very shallow linking (where the
actual reference is not sufficient to resolve what should be at the other end of the link), link annotation
can supplement automated link resolving mechanisms.

Levels of integration 

This set of requirements on links, coupled with the analysts' need to trace notes and finished 
intelligence back to its sources and their use of a variety of tools in the sense-making process, leads us
to a multi-tiered integration scheme. Of the different tools and applications available in the analysts' 
environment, some will be more amenable to deep integration than others. Furthermore, we have found 
that the various kinds of notes that analysts take require greater or lesser connection to outside 
information, and that in some situations, the payoff for deeper integration is large, while in others, 
shallow integration is all that is necessary. 

We have divided integration into three levels, listed in order of depth: (1) data or content based 
integration; (2) tool or node based integration; and (3) display or window based integration. This list 
suggests a need for three protocols, which we feel are general to embedding hypertext in a 
heterogeneous application environment: an anchoring protocol, a linking protocol, and a launching 
protocol. Figure 5 summarizes the relationship between the protocols and the depth of integration.

Figure 5. Relationship between protocols and depth of integration

At the deepest level, integration requires access to the content of a node. Integration at this level 
implies that applications must obey an anchoring protocol to describe the extent of the anchor within the
node, a linking protocol to retrieve nodes from applications outside the notetaker, and a display protocol
so the notetaker can present the node in a suitable window. Deep integration makes it possible to treat
information from outside the hypertext system the same way as it is treated within the system; thus
traversing a link is the same as it would be were the node maintained by the notetaker.

At the next level of integration, linking is supported so nodes of information from other applications can
be included; in this case, the application only needs to implement the linking and display protocols, and
traversing a link is a retrieval of a piece of information outside the notetaker.

Display-based integration is the most superficial of the three levels. The purpose of display-based
integration is to provide access to outside tools; at this superficial level of integration, traversing a link is
a launch of an application in a window.
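
The three levels described above amount to a dispatch on which protocols an outside application supports. The sketch below is one way to picture that dispatch and is not a proposed protocol definition: the protocol names follow the paper's list (anchoring, linking, launching; the text also calls the last one the display protocol), while the link fields and the wording of the steps are assumptions.

```python
from typing import Dict, FrozenSet, List, Optional

ANCHORING, LINKING, LAUNCHING = "anchoring", "linking", "launching"


def integration_level(supported: FrozenSet[str]) -> str:
    """Map the protocols an application supports to a depth of integration."""
    if {ANCHORING, LINKING, LAUNCHING} <= supported:
        return "content"     # deepest: access to the content of a node
    if {LINKING, LAUNCHING} <= supported:
        return "node"        # nodes can be retrieved and displayed
    if LAUNCHING in supported:
        return "display"     # the application can only be brought up
    raise ValueError("application supports none of the protocols")


def traverse(link: Dict[str, Optional[str]], supported: FrozenSet[str]) -> List[str]:
    """List what traversing a link involves at the application's level."""
    level = integration_level(supported)
    steps = [f"launch {link['app']} in a window"]
    if level in ("node", "content"):
        steps.append(f"retrieve node {link['node']}")
    if level == "content":
        steps.append(f"mark span {link['span']}")
    elif link.get("annotation"):
        # At shallower levels a link annotation can tell the user what to do next.
        steps.append(f"show annotation: {link['annotation']}")
    return steps
```

For example, a link into an application that supports only launching yields a single launch step, plus the user's procedural annotation if one is attached to the link.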

Figure 6 shows a hypothetical notetaking situation, where an analyst has taken a note referring to three
outside sources, one at each level of integration. The first text span of the note is integrative, and
refers to the first two outside nodes; protocols tell the notetaker how to launch each application and
retrieve the appropriate node. Because the node from the first application supports anchoring, the
extent of the anchor's span is also marked. The note's second span of text refers to the entirety of a
node in the second application; linking is supported, but anchoring is not, so only the node can be
retrieved and displayed. The third span of text in the notetaker's note refers to some portion of the
application launched in the third window. Since neither linking nor anchoring is supported, the
application can only be brought up in a window. The annotation on the third link object is the user's
procedural note describing how to get the proper information from the third application.

Figure 6. Hypothetical notetaking situation contrasting levels of integration

Defining the three levels of protocol will allow the launching, linking, and anchoring specifications to be
expressed and stored in the link objects, and understood by the outside applications to the degree that
they support the protocols.

Conclusion

In this paper, we argue that standardization efforts should not only be concerned with a hypertext 
reference model, but also a multi-tiered system of protocols for integrating information from a
heterogeneous applications environment. We make this argument using evidence from a study of a 
sense-making activity, taking notes in the performance of an intelligence analysis task; we feel that this 
activity is representative of a wider class of idea processing tasks, and that the applications environment 
shares many characteristics with other environments where hypertext will provide particular leverage on 
work involving representing and manipulating the structure of text. 

The study we have performed shows that the closed-world assumption at the root of many 
second-generation hypertext systems limits the ultimate usefulness of those systems, and that future 
hypertext work must consider at least partially open architectures. Thus creating standards for 
hypertext necessarily includes developing protocols for integration of outside applications. Our results 
suggest that three levels of protocols will be useful, an anchoring protocol, a linking protocol, and a 
launching protocol. These protocols can be closely tied to the reference model adopted by the 
hypertext community (see [Halasz & Schwartz 1989]) to ensure a common description of what is
included in each protocol.

10. Newcomb, Steven R. - Explanatory Cover Material for Section 7.2 of X3V1.8M/SD-7

Explanatory Cover Material for Section 7.2 of X3V1.8M/SD-7, Fifth Draft.

Steven R. Newcomb, 
Vice Chairman, X3V1.8M. and 
Associate Director, Center for Music Research, Florida State University 

The mission of the ANSI X3V1.8M Music in Information Processing Standards (MIPS)
committee is to develop a Standard Music Description Language (SMDL) to enable 
interchange of musical documents. The committee has chosen to represent the structure 
of the information represented by SMDL as a Standard Generalized Markup Language 
(ISO 8879-1986) Document Type Definition (an "SGML DTD"). 

In the course of its work (which began in 1986), the MIPS committee developed a 
general model for the representation of schedules for the execution of events. When it 
confronted the problem of representing music in several of its normal contexts, such as 
the interdependently synchronized lighting, staging, and orchestra cues in musical 
comedy and opera, the MIPS committee developed SGML-based means of representing
links within and among documents. These means are set forth in the following
extract (Section 7.2 ["General Links"] of the fifth draft of X3V1.8M/SD-7
["Hypermedia/Time-based Document Subset"]).

When it became clear that this model would be useful for the representation of the 
scheduling of non-musical (as well as musical) events multimedia and hypennedia 
documents, the committee extracted the time model from the other, strictly music-related 
portions of SMDL, gave the model a name ("HyTime"), and placed it in its own Standing 
Document, X3V1.8M/SD-7. In the current draft of SMDL, Standard Music Description 
Language (SMDL) is an application of HyTime. (The rest of SMDL is described i". 
X3V1.8M/SD-8.) 

When HyTime's "General Links" facilities were discussed at the NIST Hypertext 
Workshop, it turned out that the Dexter, Intermedia, and HyTime models all decomposed 
the problem of document addressing in much the same way, although their jargon was 
dissimilar. The "Room 705 Ad Hoc Group" (Ed Fox, Steve Newcomb, Tim Oren, and 
Victor Riley) succeeded in showing how the "anchor" concept in the three models could 
be merged. It is anticipated that the NIST Hypertext Workshop will have significant 
impact on succeeding drafts of HyTime. 

X3V1.8M/SD-7 Journal of Development 

Standard Music Description Language (SMDL) 

Part Two: Hypermedia/Time-based Document Subset (HyTime) 

EDITORS: 

Charles F. Goldfarb, IBM Almaden Research Laboratory 
Alan 0. Talbot, New England Digital Corporation 



Includes work as of June 22, 1989. Effective through October 31, 1989 



7.2 General Links 

General links are relationships between documents or parts of documents. The set of 
potential general links is infinite, so the mechanisms provided by HyTime are extensible by 
users and applications. 



Note: The term "general link" is used in preference to the unqualified term "link" to 
avoid confusion with the SGML link feature. However, there is no problem in using 
"link" with more restrictive qualifying adjectives, as in "hypertext link," or with no 
qualifiers when the context is clear. 

Some forms of general link occur in all documents, not just those intended for hypertext and 
hypermedia access. Those forms are represented by inherent SGML functions, so HyTime 
does not need to address them. 

Note: Some examples are: 

- Links that associate a semantic role (such as "paragraph" or "heading") with an 
element are represented in SGML by generic identifiers. 

- Other links that associate a property with an element (rather than associating two 
elements with one another) are usually represented in SGML by attributes. 

Note: ("EDITOR") We may want a specialized link element nonetheless, 
for those cases in which the document cannot be modified to add an 
attribute. 

- Links that specify layout or typography, or other processing of a document, are 
represented by the SGML link feature. 

- Links between the logical structure of the document and physical storage are 
expressed by the SGML entity mechanism, which includes the ability for a user to 
segment and link a document physically on whatever boundaries he requires. 

The following forms of general link are supported by HyTime, either via inherent SGML 
mechanisms, or by elements and attributes defined in this Standard. (The list is derived 
from "A Tentative Listing of Some Linktypes" on pp. 4/52-4/55 of Ted Nelson's Literary 
Machines, Edition 87.1.) 

Note: ("EDITOR") This list represents one view of the requirements for general link 
support, and as such provides an initial touchstone against which to evaluate the 
language design. It is provided merely as a starting point, and it is expected that 
others will suggest additions and modifications to both the list and the design. 

a) metalinks 

title 
author 

author (external claim) 
document supersession link 

b) ordinary text links for sequential documents 

correction link 
comment link 
counterpart link 
translation link 
heading link 
paragraph link 
inclusion 

quote-link (annotated inclusion) 
layout, typography, epigraphy links 
footnote link 

c) hypertext links 

vanilla jump-link 
modal jump-links 
suggested-threading links 
expansion links 

d) literary links 

citation link 
alternative-version link 
comment document 
certification links 
mail link 

Links can also solve the unique structural problems of interactive multimedia documents, 
such as instructional materials. For example, when the normal sequence of elements is 
interrupted by a user response, links in audio material could indicate suitable jumps to 
graceful endings. 

In HyTime, general links all consist of one or more "link ends" (Nelson calls them "end 
sets"), together with a description of the purpose of the link (the "link type"). A general link 
also has an associated "link term" that an application displays as a "button" from which the 
link can be accessed. In character text, the link term is a word or phrase that is the subject 
of the link, and the "button" is usually the link term in a highlighted font. In other data, the 
link term is a location (for example, a coordinate in a displayed image), and the button might 
be a cursor that changes shape when it is over the link term location. 

Note: ("EDITOR") Do we need the potential for a link term at each link end? 

HyTime includes four element types that represent general links: 

- The independent link is the most flexible. It can have any number of link ends and they 
can be in any documents, even those to which there is no write access. 

- The contextual link has only two link ends, one of which is at the location of the 
contextual link element. 

- The excerpt is a special form of contextual link that is used for including portions of other 
documents, with or without acknowledgment. 

- The location reference is a special form of contextual link that is used for automatic 
cross-referencing within a document. 
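
The difference between the independent and contextual forms can be sketched as a small data model. The following Python fragment is only an illustration under stated assumptions: the names GeneralLink, make_ilink, and make_clink are hypothetical and do not appear in the draft, and link ends are reduced to plain identifier strings.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GeneralLink:
    """Hypothetical in-memory model of a general link (not defined by the draft)."""
    link_type: str                    # purpose of the link, application-defined
    link_ends: List[str]              # identifiers of the linked locations
    link_term: Optional[str] = None   # text or location presented as the "button"

    def __post_init__(self) -> None:
        # The draft says a general link consists of one or more link ends.
        if not self.link_ends:
            raise ValueError("a general link needs at least one link end")


def make_ilink(link_type: str, link_ends: List[str],
               link_term: Optional[str] = None) -> GeneralLink:
    """Independent link: any number of ends, none tied to the link's own position."""
    return GeneralLink(link_type, list(link_ends), link_term)


def make_clink(link_type: str, here: str, other_end: str,
               content: Optional[str] = None) -> GeneralLink:
    """Contextual link: exactly two ends, one of which is the link's own location.

    The content of the element, if any, doubles as the link term.
    """
    return GeneralLink(link_type, [here, other_end], content)
```

In this sketch an excerpt or a location reference would simply be a contextual link with a fixed link type; how an application presents either one is left open, as in the draft.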



7.2.1 Independent Link 

The element independent link (ilink) represents a general link whose link ends are 
independent of the ilink element itself. The content of the ilink element, if present, is the link 
term. 

An independent link occurs, as its name implies, out of the normal context of the document. 
Its location need have no connection with the location of its link ends. 

Note: An ilink can be used in situations where it is not possible to modify the link 
end locations. If one of the link ends can be modified, it may be more convenient to 
use a contextual link (see 7.2.2). 

The attribute linkends (link ends) identifies one or more locations that are the subject of the 
link. Each can be a document location, data entity location, or some other element, 
including another general link. The number of link ends, and their meaning, are a function 
of the link type, which is determined by the application. 

The attribute independent link type (ilinktyp) identifies the purpose of the link. The possible 
values are determined by the application. 

Note: Uses for independent links include comments and notes by reviewers and 
collaborative authors, external thesauri and indexes, and identification of various 
kinds of alternative versions. 

The attribute link term (linkterm) identifies the link term of the link. If not specified, the 
content of the ilink element is the link term. 

The entity a.ilink allows additional attributes to be defined. 



<!-- 7.2.1 Independent Link -->
<!ELEMENT ilink     -- Independent link: independent of its location (included) --
                    - O ANY >
<!ENTITY % a.ilink "" -- User-defined independent link attributes -- >
<!ATTLIST ilink
          id        -- Used when this ilink is linked to --
                    ID       #IMPLIED
          linkends  -- Ends of link: element, docloc, or entloc --
                    IDREFS   #REQUIRED
          ilinktyp  -- Purpose of link (application-defined) --
                    CDATA    #IMPLIED  -- Default: implied by GI --
          linkterm  -- Index term or "button" location --
                    IDREF    #CONREF   -- Default: content of ilink --
          %a.ilink; >



7.2.2 Contextual Link 

The element contextual link (clink) represents a general link with two link ends. One of the 
link ends is the content of the contextual link element, which must be valid in the context in 
which the clink element occurs. The content can be empty if the link end is simply a point in 
the text, rather than a span of a character string. 

A contextual link occurs, as its name implies, in context at exactly the location of one link 
end. The content of the contextual link element, if it is not empty, is the link term as well as 
a link end. It is also treated as part of the content of the containing element, just as if there 
were no clink tags around it. 

Note: A clink can be used only if the link has only two ends and one of them can be 
modified to incorporate the clink tags. In other cases, the independent link can be 
used (see 7.2.1). 

The attribute linkend (link end) identifies the other end of the link. It can be a document 
location, data entity location, or some other element, including another general link. The 
meaning of the link end is a function of the link type, which is determined by the application. 

The attribute contextual link type (clinktyp) identifies the purpose of the link. The possible 
values are determined by the application. 

Note: Uses for contextual links include various forms of hypertext links and 
alternative access paths through a document. 

The attribute automatic return (return) indicates whether processing of the document returns 
automatically to the end of the clink after processing the link end. 

The entity a.clink allows additional attributes to be defined. 



<!-- 7.2.2 Contextual Link -->
<!ELEMENT clink     -- Contextual link: nested subelement of its parent --
                    - O ANY >
<!ENTITY % a.clink "" -- User-defined contextual link attributes -- >
<!ATTLIST clink
          id        -- Used when this clink is linked to --
                    ID       #IMPLIED
          linkend   -- Other end of link: element, docloc, or entloc --
                    IDREF    #REQUIRED
          clinktyp  -- Purpose of link (application-defined) --
                    CDATA    #REQUIRED
          return    -- Automatic return at end of linkto element --
                    (return|noreturn) noreturn
          %a.clink; >



7.2.3 Excerpt 

The SGML external entity reference is the normal vehicle for including text from one 
document within another. Such inclusion is transparent, in the sense that if the included 
material is itself represented in SGML, an SGML parser will deal with it without advising the 
application program. Therefore, if an application wishes to acknowledge that certain 
material is included from other documents, an additional construct is required. 

The element excerpt (excerpt) is a type of contextual link that identifies a portion of another 
document (the "excerpt source") that is included in this one. In other words, the excerpt 
source replaces the excerpt element. The included text must be valid in the context in which 
the excerpt element occurs. 

The attribute quote (quote) indicates whether the existence of the inclusion is made evident 
to the reader of this document. 

The attribute excerpt source (xsource) identifies the location of the text to be included. It 
points to a document location or data entity location element that describes a location in a 
document other than the one in which this excerpt element occurs. 

The attribute acknowledgment (ack) identifies the location of acknowledgment data for the 
included material, such as a copyright notice. The acknowledgment can be in any notation 
suitable for use in conjunction with the included material; for example, an image that can be 
overlaid on an included video clip. 



<!-- 7.2.3 Excerpt -->
<!ELEMENT excerpt   -- Part of another document included in this one --
                    - O EMPTY >
<!ATTLIST excerpt
          id        ID       #IMPLIED
          xsource   IDREF    #REQUIRED
          quote     -- Reveal existence of excerpt --
                    (quote|noquote) noquote -- Default: conceal --
          ack       -- Acknowledgment text --
                    IDREF    #IMPLIED >



7.2.4 Location Reference 

Applications that use HyTime will frequently define specialized link elements for 
cross-references to headings, footnotes, and figures. When a document is presented, the 
reference elements are replaced by the heading text, footnote numbers, or figure captions of 
the elements to which they refer. The location reference element, in conjunction with the 
location elements defined later, offers a generalized mechanism for such cross-references. 

The element location reference (locref) is a form of contextual link whose other link end is a 
location element. An application will normally process a location reference by replacing it 
with data that is derived from (but is not necessarily identical to) the content of the link end. 

Note: A location reference therefore differs significantly from an entity reference: the 
latter is an SGML construct whose behavior is defined precisely by ISO 8879, while 
the behavior of a location reference is entirely application-dependent. 



<!-- 7.2.4 Location Reference -->
<!ELEMENT locref    -- Reference to a location element --
                    - O EMPTY >
<!ATTLIST locref
          id        ID       #IMPLIED
          idr       IDREF    #REQUIRED >



7.2.5 Locations 

A general link must refer to one or more locations in documents. SGML provides two 
inherent constructs for identifying locations: 

a) A unique identifier ("ID") attribute, which identifies a complete element in the same 
document as the reference to it. 

b) An entity name, which identifies a complete entity (frequently data without SGML markup) 
in the same document from which it is referenced. 

These constructs are insufficient by themselves for general links, because the link ends of a 
general link could be outside the document in which the link occurs, or they could constitute 
only a portion of a data entity or element. For these reasons, HyTime supplements these 
constructs with several "location" elements that can be used separately and in combination 
to represent the following locations: 

a) In a data entity, a point or a span of data, either 

1) in terms of a data content notation (e.g., a video frame number, a coordinate in space, 
an offset in time); or 

2) in terms of the uninterpreted characters. 

b) In an SGML document or subdocument entity, either 

1) the entire document or subdocument; or 

2) some identified element within it; or 

3) some data location within the identified element (interpreted or uninterpreted). 

Note: ("EDITOR") In the next edition, the element location facility will be 
extended to address a span from one element location to another. 



7.2.5.1 Data Entity Location 

The element data entity location (entloc) identifies a portion of a data entity. The data could 
be "character set data," or it could be "notation data," which must be interpreted according 
to a particular data content notation. The portion could be a single point, or a span of data 
between two points. 

The attribute data entity name (dataent) identifies the data entity to which the data entity 
location refers. If not specified, the data entity is the same as that of the previous entloc 
element. 



<!-- 7.2.5.1 Data Entity Location -->
<!ELEMENT entloc    -- Identifies a portion of a data entity --
                    - O (cdloc | ndloc) >
<!ATTLIST entloc
          id        ID       #REQUIRED
          dataent   ENTITY   #CURRENT  -- Default: previous entloc -- >



Character Set Data Location 

The element character set data location (cdloc) defines a single point in character set data, 
or a span of data between two such points. 

The element character set data point (cdpoint) defines a point in character set data. The 
point is represented as an integer offset from the first character in the data. A value of 0 
refers to the point prior to the first character, except when only one cdpoint is specified in a 
cdloc, in which case it refers to the point after the last character. 

Only characters that an SGML parser passes to an application are counted (for example, a 
record end after a start-tag is not normally treated as a data character). 



<!-- 7.2.4.1.1 Character Set Data Location -->
<!ELEMENT cdloc     -- Character set data location --
                    - O (cdpoint, cdpoint?) >
<!ELEMENT cdpoint   -- Character set data point --
                    -- Offset from first significant character --
                    -- 0 = before first char (after last if only one cdpoint) --
                    O O (#PCDATA) >
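
The offset rule for cdpoint, including the special reading of 0 when only one point is given, can be sketched as follows. This is an illustrative Python fragment, not part of the draft. The function name resolve_cdloc is hypothetical, and it assumes that `data` already holds only the characters the parser passes to the application, that an offset k denotes the point after the first k characters, and that two points are given in ascending order.

```python
from typing import Optional, Tuple


def resolve_cdloc(data: str, first: int,
                  second: Optional[int] = None) -> Tuple[int, int]:
    """Return (start, end) string indices for a cdloc over `data`.

    Offsets name points between characters: 0 is the point before the
    first character, 1 the point after it, and so on (an assumption).
    """
    n = len(data)
    if second is None:
        # Single cdpoint: per the draft, 0 means the point after the last character.
        point = n if first == 0 else first
        if not 0 <= point <= n:
            raise ValueError("cdpoint lies outside the data")
        return (point, point)
    if not 0 <= first <= second <= n:
        raise ValueError("cdpoints must be ordered and lie inside the data")
    return (first, second)
```

With two points the result can be used directly as a slice, for example `data[start:end]`; with one point the start and end coincide and the location is a position rather than a span.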



Notation Data Location 

The element notation data location (ndloc) defines a point or a span between points in data 
that is subject to interpretation by a data content notation. The representation of the point or 
span is not defined by this standard; it depends upon the notation in which the data itself is 
represented. 

In HyTime applications, the data would normally represent occurrences in space, time, or 
both, so a notation data location would consist of offsets on a visual coordinate system, 
and/or elapsed time values. Some notations also provide the ability to "label" items for 
identification. In such cases, a notation data location could refer to such labels. 

The attribute snap (snap) indicates whether the specified location should be adjusted to 
conform to alignment or synchronization points in the data. The specified location can be 
"snapped" to the nearest, next previous, or next following alignment point, or not at all. 

Note: Graphics representations commonly have an associated "grid" to which 
objects can be "snapped" in order to assure alignment and/or a minimum resolution. 
Similarly, representations with an internal time base frequently include 
synchronization points, such as frame markers in SMPTE encoding of movies and 
video. 

Note: ("EDITOR") It may be possible to define a generalized method of referencing 
space and time locations that would serve for a wide variety of notations. Such a 
method could be incorporated into HyTime as the definition of an ndloc element. The 
snap attribute is an example of one possible parameter. Suggestions are invited. 



<!-- 7.2.4.1.2 Notation Data Location -->
<!ELEMENT ndloc     -- Notation data location --
                    -- Offset in time or space and duration or size, or label --
                    - O (formula) -- Depends on data content notation -- >
<!ATTLIST ndloc
          snap      -- Specified point is changed to aligned point --
                    (nearest|before|after|none) none >
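
The four values of the snap attribute can be illustrated with a one-dimensional example. The Python sketch below is an assumption-laden illustration rather than anything the draft specifies: the function name snap and the representation of alignment points as a sorted sequence of numbers are hypothetical, a location that already lies on an alignment point is left where it is, and a location with no alignment point in the requested direction is returned unchanged.

```python
from bisect import bisect_left, bisect_right
from typing import Sequence


def snap(value: float, grid: Sequence[float], mode: str = "none") -> float:
    """Adjust `value` to an alignment point taken from the sorted sequence `grid`."""
    if mode == "none" or not grid:
        return value
    if mode == "before":
        i = bisect_right(grid, value)   # grid[:i] holds the points <= value
        return grid[i - 1] if i > 0 else value
    if mode == "after":
        i = bisect_left(grid, value)    # grid[i:] holds the points >= value
        return grid[i] if i < len(grid) else value
    if mode == "nearest":
        # On a tie, the earlier alignment point wins because min keeps the first match.
        return min(grid, key=lambda point: abs(point - value))
    raise ValueError(f"unknown snap mode: {mode}")
```

For time-based data the grid might hold frame boundaries; for graphics, two such calls (one per axis) would be needed. Both uses are assumptions about an application, since the draft leaves the representation to the data content notation.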



7.2.5.2 Document Location 

The element document location (docloc) identifies a portion of an SGML document by 
means of an optional element location, and an optional data location within that element. If 
no element location is specified, the "element" is the entire document. If an element 
location is specified, but no data location, that complete element is the "document location." 

The attribute document entity (docent) identifies the entity in which the document begins. If 
omitted, it is the same entity in which the docloc element occurs. 



<!-- 7.2.4.2 Document Location -->
<!ELEMENT docloc    -- Identifies a portion of a document or subdocument --
                    -- Entire document if element location is omitted --
                    -- Entire element if data location is omitted --
                    - O (elemloc, (cdloc | ndloc)?)? >
<!ATTLIST docloc
          id        ID       #REQUIRED
          docent    ENTITY   #IMPLIED  -- Default: this document -- >



Element Location 

The element element location (elemloc) identifies an element either by a unique name, or by 
a sequence of "node locations," called a "node path." The element location permits a 
general link to refer to an element in a different document or to an element (in any 
document) that does not have a unique identifier attribute ("ID"). 

The attribute element identifier (elemid) is the unique identifier ("ID") of the 
element whose location is being identified. If the element has no unique identifier, its node 
path is used instead. 

Notes: 

a) The attribute elemid is not declared to be an "IDREF" attribute because its value may be 
an ID from another document. An SGML parser will normally check for the validity and 
uniqueness of an IDREF, but cannot do so for an ID from another document, as it could 
conflict with an ID from this document. 

b) The keyword "#CONREF" identifies a "content reference attribute." If a value is specified 
for the attribute, the SGML parser will expect the content to be empty. 
The application is expected to use the attribute value in some way as a substitute for the 
data that would ordinarily have been in the content. 






<!-- 7.2.5.2.1 Element Location -->
<!ELEMENT elemloc   -- Identifies an element of a document or subdocument --
                    - O (nodeloc+) >
<!ATTLIST elemloc
          elemid    NAME     #CONREF   -- Default: use node path -- >



Node Location 

The element node location (nodeloc) identifies the sequential position of an element among 
its siblings in the tree structure of the document. The node location is an integer greater 
than zero, and each separate data portion in mixed content is treated like an element when 
counting. 

Note: For example, in a paragraph consisting of some character data followed by a 
quotation element, and then some more character data, the first character string 
would have a node location of "1," the quotation a node location of "2," and the 
second character string a node location of "3." 

Any element, including the pseudo-elements containing the data in mixed content, can be 
identified uniquely by a "node path" consisting of an ordered sequence of the node locations 
of itself and its ancestors, starting at the root of the document tree. 

Note: For example, in a document with the following structure: 

<core><mmseq><ce></ce><ce></ce></mmseq><baton><tempo><tempo></baton></core> 
the second tempo element can be identified by the node path: 
1 2 2 

An element that is empty or that contains only data (including the pseudo-elements 
containing data in mixed content) is a leaf of the document tree. Its data does not have a 
node location, but can be addressed with a data location element. 



<!-- 7.2.5.2.2 Node Location -->
<!ELEMENT nodeloc   -- Node location: integer > 0 (each #PCDATA is one) --
                    - O (#PCDATA) >
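
Node path resolution as described in this section can be sketched with a short tree walk. The Python fragment below is an illustration only. It uses the standard xml.etree.ElementTree module on well-formed XML as a stand-in for a parsed SGML document; the function names children and resolve_node_path are hypothetical; and it assumes that whitespace-only text between elements does not count as a data portion.

```python
import xml.etree.ElementTree as ET
from typing import List, Union

Node = Union[ET.Element, str]


def children(elem: ET.Element) -> List[Node]:
    """Child nodes in document order; each non-blank text run counts as one node."""
    nodes: List[Node] = []
    if elem.text and elem.text.strip():
        nodes.append(elem.text)
    for child in elem:
        nodes.append(child)
        if child.tail and child.tail.strip():
            nodes.append(child.tail)
    return nodes


def resolve_node_path(root: ET.Element, path: List[int]) -> Node:
    """Follow a 1-based node path that starts at the root of the document tree."""
    if not path or path[0] != 1:
        raise ValueError("a node path starts at the root, which is node 1")
    node: Node = root
    for position in path[1:]:
        if isinstance(node, str):
            raise ValueError("a data node is a leaf and has no children")
        if position < 1:
            raise ValueError("node locations are integers greater than zero")
        node = children(node)[position - 1]
    return node
```

Applied to an XML rendering of the structure in the note above, the path 1 2 2 selects the second tempo element, and in a paragraph of mixed content the text runs take positions alongside the elements, matching the numbering given in the earlier note.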



7.2.5.3 Point Location 

The element point location (pointloc) identifies a point in an element so that it can be 
referenced. Its content, which is optional, can be used by an application to describe the 
point. 

Note: For example, when printing a cross-reference to it. 



<!-- 7.2.5.3 Point Location -->
<!ELEMENT pointloc  -- Identifies a point in an element --
                    -- Content can be used by application to describe point --
                    - O ANY >
<!ATTLIST pointloc
          id        ID       #REQUIRED >



Distributed Hypermedia Standards 

A Position Paper for the NIST Hypertext Standards Workshop 
Tim Oren, Apple Computer 



1. Directions for Hypertext Standards 

Much discussion of hypertext standards has centered on the transfer of 
closed, static hypertext document bases among various platforms and 
organizations. While there is an undoubted need focused on the use of 
hypertext with optical media and technical documentation, the thesis of 
this position paper is that any standard based primarily on this limited 
application will be necessarily flawed. 

The original vision of hypertext was a universally shared, dynamic 
"docuverse" which could be read and written by all users. Although 
systems short of this grand vision have proven utility, we would not wish 
to abandon this future or the smaller scale visions of [...] 
enterprise-wide hypertexts. Nelson proposed that one unified backend 
storage mechanism, "Xanadu," would solve the distributed hypertext 
problem for all [Nelson 80]. Though the Xanadu system is now advancing 
toward commercial release, it comes late in the day. There are already 
established commercial hypertext systems and sizable collections of 
content which are unlikely to be abandoned. 

Hence if we want the docuverse to become reality, we must build it in the 
distributed, multivendor computing milieu of today. To bring together 
the diverse software and hardware systems already existing we will need 
abstract models of hypertext and ultimately standards based on the 
models. If this work is to be viable, the results must also reflect technical 
and market realities, and interaction with other areas such as multimedia 
and compound documents must be considered. In the remainder of this 
paper, I examine some of the requirements posed by these realities, 
propose design principles for meeting these requirements, and suggest 
that an open system architecture should be the ultimate goal of hypertext 
standardization efforts. 

2. Technical Conditions 

Working in today's computing environment means working with existing 
networking and file standards. These are characterized by loose 
connectivity and modest reliability. Not only do LANs and WANs break 
down, but many connections are deliberately noncontinuous for cost 
reasons. Remote resources such as servers fail and go offline, often due 
to crashes that mean reloading earlier data versions. Existing file and 
device level utilities allow copying and alteration of file [...] 
structures without warning to the applications which rely on them. All 
existing standard user interface systems are aimed at this level. These 
utilities are used routinely to remove partial document collections for 
[...] to other sites, and to return modified versions to 
[...] hypertext [...] environment must be 
robust when faced with a variety of insults to document identity and link 



Figure 1. Compound document 

Figure 2. Hypermedia document 

Activity in hypertext standards interacts with other advanced document 
models. For instance, figure 1 shows a "compound document" where 
various text and graphical entities (E) are assembled into a page under 
the control of a layout specification. However, rather than storing the 
entities in a single file, it might be realized as shown in 
figure 2, where hypertext is used to implement a compound 
document: the graphic entities are placed using links (L) to persistent 
selections (P) within other files. 

[...] and constraints as well as static information. 
[...] transformation of the linked data 
[...] synchronization 
information for pieces of dynamic media. Finally, as suggested in figure 
4, [...] possible "component 
documents" where each entity may be edited in place by software modules 
selected at runtime by the user. Implementing a component software 
system will require a standard data storage substrate very similar to 
hypertext which vendors of individual components can use and extend. 

Because these issues and applications all interlock, it is not possible to 
restrict a discussion of hypertext standards to static text alone or to 
particular document models. A standard arrived at in this fashion will 
suffer one of two fates. At best, it will create a "golden ghetto" where a 
class of hypertext applications may live, but without connecting to other 
media types or document models. At worst, it may coopt and prevent 
progress in these areas. 

Figure 3. Multimedia documents 

Figure 4. Component documents 

In these examples the objects are not exclusively text. They include static 
bit map graphics and object graphics, dynamic animation, sound and 
video. Each of these data types represents a corresponding discipline and 
standards effort. A hypertext standard which restricts itself to text alone 
is crippled at the beginning. One which attempts to reinvent standards 
for each constituent media type would create a ghetto effect, and might be 
simply impractical given the effort required. It would seem that a 
hypertext standard must find a way to embrace existing media type 
standards with a minimum of modification. In the remainder of this 
discussion, I use hypertext in its most general sense, to indicate the 
scheme for linking all data types, not just text. 



Hypertext as a functioning discipline is quite young, and disagreement 
and lack of understanding of systems architecture and application needs 
are still rife. There is controversy at even the fundamental level of 
linking method and storage organization. Various systems implement 
links as separate webs or within documents, and represent them 
abstractly or procedurally. A recent panel on system architecture makes 
it clear that there is still substantial change, with many systems seeking 
to adapt the better features of the other approaches [Halasz 89]. It is also 
clear that the diversity of systems is not gratuitous variation, but has 
occurred because of real differences in the intended applications and 
audiences. No "one right way" to do hypertext has emerged. 

Above the storage level, diversity increases further. Labelled links are 
used diversely, to represent constraints, timing, inferencing and 
rhetorical information for the use of both the browser and the software. 
User interfaces to large, interlinked data stores are an area of active and 
fruitful research. More complex architectural issues such as versioning 
and searchability are just beginning to be explored. Again, the various 
approaches and progress have been largely driven by the needs of 
particular applications. 

Attempts to standardize in a discipline in such flux must take account of 
the diversity of approaches if they are not to cripple progress. To the 
greatest extent possible, formalisms must embrace the diversity of 
architectures and applications rather than being exclusive or 
prescriptive. 



3. Market Conditions 



A standard must consider prevailing market conditions to be effective. In 
the case of distributed hypertext, the installed base of machines on 
networks is characterized by wide diversity of vendor, architecture, and 
hypertext software. Significant hypertext systems run on Macintoshes, 
IBM PCs and PS/2s, Sun, DEC and other Unix equipment, interconnected 
with a variety of LAN architectures, many also connected to long haul 
networks such as Bitnet or Milnet. Hypertext software is provided by both 
hardware vendors (HyperCard, Sunlink) and independent software 
vendors (KMS, HyperTIES, Guide). Initial market penetration of hypertext 
technology is occurring in the areas of in-house and external technical 
documentation and distribution of multimedia content, particularly on 
optical discs. Substantial commercial and academic efforts are underway 
to introduce hypertext as a mechanism for collaborative work in the 
computing environment. 

Given this diversity of platforms, the resemblance of distributed 
hypertext to the open systems efforts undertaken in networking and 
structured databases is obvious. The existing vendors, applications, and 
users will not be dislodged by either a proprietary specification such as 
Xanadu or a public standard. A successful effort must coopt existing users 
by extending their reach onto other platforms. It should become possible
to, for example, read nodes within HyperCard without necessarily being
aware that they reside in a remote database created in HyperTIES.

The technical issue of non-textual data also has a market component. Not
only do standards for various data types evolve separately, but the 
markets for the underlying technology in hardware and software 
progress at their own speed. Of particular importance, there is often a 
succession of dominant applications within a media type. For instance, on 
the Macintosh, MacPaint was surpassed in turn by SuperPaint and
PixelPaint. A standard must accommodate this process in two ways. First,
it must not bind data tightly to its creating application, in order that the
user may replace it with another at a later date. Second, the standard
must be extensible, to allow vendors to compete on features without being
required to abandon the standard. 

Another market phenomenon is the decline of the so-called "integrated 
application." The required feature set within each data type has become 
so large that a project or product which attempts to do all becomes 
impractical. Integrated applications linger only at the novice level. 
Much integration is now done by cut-and-paste or data piping facilities at 
the operating system level. 

Hypertext may be viewed as the next logical evolution of integrated 
applications, with the ability to freely browse between all data types. 
Given the issues outlined above, it follows that the hypertext facility will 
need to be implemented at the system level to be effective. A successful
standards effort must then include platform vendors and provide a 
mechanism for their joint efforts. 

The hypertext market is quite young. Many of the software vendors are 
startup ventures and are thin on capital and engineering resources. A 
successful standard must address this problem by making 
implementations available to such developers at very low cost. Failure to 
do this would confine use of the standard to high-end markets where 
firms and clients can afford the engineering overhead to implement the 
standard. It would also cut the standard off from the most innovative
sector of the software market. Even a low cost standard must present
convincing advantages in integration, power, and room for growth if
developers are to give up proprietary schemes of data storage.

4. Design Principles for a Hypertext Standard 

What principles can be deduced from these technical and market 
constraints? First, a standards effort must start with the creation of an
abstract model of hypertext which is as inclusive as possible. Because
many existing hypertext systems were tightly driven by application
scenarios, this means looking at a variety of user communities' needs.
Particularly, building any system architecture driven by the needs of one
application area into a standard would be inadvisable. The work of the
Dexter Hypertext Model is a useful precedent in this area [Halasz 90].

Any standard must be portable to the greatest extent, not dependent on
particular processor, display, network, or peripheral architectures.
Portability will allow the greatest degree of interoperability in the
current computing environment, and guarantee survivability onto
succeeding generations of technology.

To incorporate existing data type standards and allow the
implementing software to evolve independently, a hypertext standard
must support modularity. Data items may be incorporated by reference to
[...] standard form hypertext [...] incorporate hypertext features should
be minimal.

The standard must be extensible to support the rapid evolution of
both data type specific software and notions of usage of links. Any typing
mechanism built into the hypertext definition must be open to extension.
Methods must be provided for superseding one representation of a data
element with another without disrupting the entire hypertext. Facilities
are needed for incorporating proprietary data representations with
the facility to point at parallel standard representations.

Figure 5. Separability: Moving data around

A principle termed "separability" is important to coexistence with today's
[...] organization called entities. An entity encapsulates sufficient data and
metadata that it may be moved or copied between files without loss of
information. For instance, an animation data entity might contain a series of
frames, persistent selections for linking, a color lookup table (CLUT) and a
[...] resolution and depth and processor resources. This could be moved in
its entirety, while copying the frames alone would lose information as they
were moved out of context. Figure 5 illustrates this concept, as well as the
related feature that entities must be robust in the face of missing linked
data. In the partial hypertext extracted to a remote machine the library
image is missing, but sufficient layout information remains to block out its
location and allow work to proceed.

Separability must be supported with identity and inspectability. A robust 
identity mechanism allows an implementing system to detect if a
referenced entity is missing or present in duplicate. Note that identity 
may be separated from the particular mechanism which a system uses to 
find the referenced entity. Various implementations might keep merged 
databases of entity identity vs. location, or resolve references using 
heuristic mechanisms peculiar to a platform. 

Inspectability means that the interdependencies of entities must be 
apparent to a utility which understands the linkage standards only, and 
has no knowledge of the internal structure of data entities. Such a naive 
utility may then copy or move portions of the hypertext without a need 
for extensions as new entity types are added. 
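
The three properties above (separability, identity, inspectability) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Entity` class, its field names, and the `copy_subset` utility are hypothetical and are not part of any proposed standard. The point is that the copying utility reads only the identity and reference metadata, never the payload, and reports missing referenced entities instead of failing.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Hypothetical entity: opaque payload plus metadata any utility can read."""
    entity_id: str          # identity, independent of where the entity is stored
    media_type: str         # e.g. "animation"; never interpreted by the copier
    payload: bytes          # internal structure is not inspected
    references: list = field(default_factory=list)  # ids of linked entities


def copy_subset(store, wanted_ids):
    """Copy selected entities using only linkage metadata.

    Returns the partial store and the set of referenced ids absent from it,
    so that a reader can block out placeholders for the missing entities.
    """
    partial = {eid: store[eid] for eid in wanted_ids if eid in store}
    missing = {ref for ent in partial.values()
               for ref in ent.references if ref not in partial}
    return partial, missing


store = {
    "doc-1": Entity("doc-1", "text", b"...", references=["img-7"]),
    "img-7": Entity("img-7", "image", b"..."),
}
partial, missing = copy_subset(store, ["doc-1"])
```

Because `copy_subset` never looks inside `payload`, adding a new entity type requires no change to the utility, which is the property the text calls inspectability.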

To allow room for the evolution of hypertext technology, a layered 
standard will be necessary. To permit layering, each portion of the 
standard must be policy neutral. This means that it must allow a wide
range of choices in how it is applied by higher layers. For instance, a 
standard which specified link formats and also required their storage in a 
single "web" would not be neutral, because it enforces a particular 
implementation. A policy neutral formulation would specify the format 
and possibly behavior of links without specifying in what place(s) they 
must be stored. Policy neutrality also permits the delegation of certain 
design choices to implementors, and provides degrees of freedom for 
technical issues with no current solution. These issues include the 
division of entities and linkage information between files, link typing 
and usage, searchability, and version management. Again, an abstract
model is helpful in creating the generalizations needed for policy
neutrality. 

Standards may be expressed as data formats or as behaviors. A hypertext 
standard expressed as an explicit data format is probably necessary to 
support environments where only serial ASCII or binary data is available. 
This is typical of the bulk transfer of reference hypertexts between 
machines. However, such a format is poorly adapted for update and 
search. Neutrality of applications is better provided by standardization at 
the behavioral level of an application program interface (API). A 
compliant implementation might simply provide access to the standard 
serial hypertext form, but would more likely implement a random access
or object-oriented filing mechanism adapted for its particular platform.
The distributed open hypertext environment is then implemented as
peer-to-peer conversations among compliant implementations of the
standard. 

5. Conclusions 

Standards must be approached cautiously in a field as new as hypertext. 
While we may need interim or experimental specifications for particular
application areas, making the exchange of static hypertexts the subject of
a standard is undesirable. Decisions which we make will necessarily
affect other areas such as multimedia and compound documents. A
premature standard could have the effect of ghettoizing a subset of
hypertext. The goals of a hypertext standard should be the
implementation of the vision of distributed hypertext within an open
systems framework.

Abstract

A necessary first step in discussing standardization in a domain is the development of 
a reference model for that domain, a high-level framework within which specific topics
for discussion can be defined and discussed. This paper offers a "straw" version of such 
a framework as a basis for discussion, and discusses the "standardizability" of various 
detailed subjects within that framework. 

1. Introduction 

A reference model is a high-level description of a domain within which discussion of
more detailed subjects can be situated. As a mechanism for setting the context of a 
domain, reference models have been useful in several fields. This section gives examples 
of other reference models, suggests some of the uses to which they may be put, discusses 
why a reference model is desirable for hypermedia, and outlines the high-level structure 
of a proposed reference model for hypermedia. 

1.1. Examples of Other Reference Models 

Reference models have been proposed in many domains, including telecommunications, 
factory control architectures, and material handling architectures. 

Perhaps the best known reference model is the ISO-OSI seven-layer model for 
telecommunications. [DAY83] By articulating the various communications functions and
defining an ordered relation among them, this model has supported a vigorous and
productive standardization effort. 

A number of studies have proposed reference models for manufacturing control;
[PARU87] provides a useful summary, and [BIEM89, WILL89] are more recent
treatments. These studies have been motivated by the growing interest in integrated
manufacturing, and the resulting need to relate the various entities in a manufacturing
enterprise to one another in a consistent way. 

In the domain of material handling, the OSI model has been adopted to define a 
layered model for the transport of material [PARU88] and this model has been used as
the basis for experimental implementations in our laboratory. 

1.2. The Uses of a Reference Model 

A reference model is useful for description, standardization, design, and innovation. 

It provides a descriptive framework for comparing existing systems in its domain, and 
in fact is often compiled by surveying existing systems for similarities and differences. 
It thus provides an underlying ontology of its domain. 

By identifying the critical subjects in the domain and showing how they are related to
one another, it provides a context for standardization. It facilitates discussion of what is 
and is not ready for standardization, identifies specific subjects for standards, and calls 
out where subsystems (and thus the standards that describe them) must interface with 
one another. 

As a high-level analysis of its domain, a reference model guides the designer of a new 
system in identifying the issues that must be addressed and the broad functions that the 
system must provide, as well as suggesting the kinds of solutions that have been
attempted in the past. 

Reference models not only help to mature a field through development of standards 
and common analyses, but can also foster innovation. At the detailed level, by 
partitioning the problem, they invite the development of new solutions, showing what 
has already been tried. At a higher level, they invite creative thinkers to challenge their
overall structure and thus introduce new paradigms.

The descriptive and prescriptive functions of a reference model are in natural and
unavoidable tension. As a guide to classifying existing systems and as a pointer to 
needed innovation, a reference model should be as comprehensive as possible, able to 
embrace any implementation of the domain. As a roadmap for standardization or a 
guide for designers, it should embody design choices that reflect good practice and 
sound engineering, and thus be selective. It seems reasonable to expect that reference 
models will follow a life-cycle that moves from broad and descriptive to selective and
prescriptive. While it may be premature to build prescriptive models of hypermedia, it 
is not at all too early to formulate broad descriptions of the underlying technologies, 
descriptions that through time can evolve into more selective models. 

1.3. Why a Reference Model for Hypermedia? 

A reference model for hypermedia is desirable not only for helping the technology to 
mature, but also for fostering its development as a distributed tool. 

Every worker in a domain has an individual "reference model" of that domain within 
which various contributions to the field are implicitly classified and assessed. A textbook 
in a domain is essentially an instantiation of such a model, and helps newcomers to the 
domain to put in place a mental framework within which to operate. The rapid growth 
of interest in hypermedia makes this educational service particularly desirable in the 
case of hypermedia. However, if this were the only motive, it is questionable whether a 
joint activity to develop such a model would be justified. 

The need for a jointly developed model arises from the potential of hypermedia as a 
distributed technology. Hypermedia is distributed in at least two ways. First, it has 
proven to be a useful medium for managing the collaboration of teams of 
workers. [CONK87, HALA87] Thus it is often implemented as a distributed application,
with the resulting need for standards to insure that the various components of such an
application are consistent with one another and can be maintained in a modular
fashion. This motive for standardization becomes especially strong when the components
are not operating in a homogeneous environment. Second, the information that is linked
together in a hypermedia system is often distributed in the sense of being of differing
types and origins. The ability of a hypermedia system to access generic materials 
without expensive recoding and preprocessing will depend on the rapid development and 
broad dissemination of standards for the production and encoding of machine-readable 
information. 

1.4. A Possible High-Level Structure 

The reference model sketched in this paper is described from three perspectives: the 
functional elements of a hypermedia system, implementation concerns, and interface 
issues. We will outline the main elements to be considered in each of these areas, and 
also suggest the applicability of standards to each area. 

2. Elements of Hypermedia 

The two basic elements of a hypermedia system are nodes of information and links 
that join them together. In addition, recent research suggests that the usability of 
hypermedia depends on the disciplined use of structured composites of nodes and links 
as higher-order entities. 

2.1. Nodes 

The nodes of a hyperbase are the units of information that it assembles together and
among which it provides ready movement. The nodes in a system can be described 
from the perspective of their contents, their typing, and their structure. 

2.1.1. Node Contents 

The very name "hypertext" suggests that virtually every hypermedia system can 
present information in the form of text. Most implementations support some form of 
graphic display as well. Animation, video, and audio are less common but have been 
demonstrated. [BIEB89] suggests generalizing the notion of a node to "any information
item about which the system can reason." Such a definition permits a node to be
executable code that is invoked when the link leading to it is traversed, thus leading to
any conceivable kind of computer operation. In fact, some early antecedents of
hypertext were menu systems, in which all leaf nodes were of this sort.

As long as nodes are treated as atoms, there is no difficulty with such a variety of
node contents. For many purposes, one must define locations within nodes, either as 
destinations or as origins for a link. The mechanisms for such definition are highly 
dependent on node contents. For example: 

• Because text is one-dimensional, location in a textual node is conveniently 
defined on the basis of characters. 

• In graphical nodes, location is defined two-dimensionally on the basis of
pixels. 

• Animation and video invite the same pixel-based definition of location as 
does graphics, but there is an additional time dimension. 

• Location in an audio node is most readily defined temporally. 

• In a node consisting of executable code, the instruction counter is a 
reasonable measure of location. If the node processes user input, location can 
be defined in terms of the possible user trajectories through the program. 
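
As a rough illustration of how the measure of location depends on node contents, the list above might be modeled with one small location type per medium. The class and field names here are assumptions made for the sketch, not terminology from any existing system, and the executable-code case is omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSpan:
    """One-dimensional location: character offsets within a textual node."""
    start: int
    end: int


@dataclass(frozen=True)
class PixelRegion:
    """Two-dimensional location: a rectangle of pixels in a graphical node."""
    x: int
    y: int
    width: int
    height: int


@dataclass(frozen=True)
class TimeInterval:
    """Temporal location: seconds from the start of an audio node."""
    start_s: float
    end_s: float


@dataclass(frozen=True)
class VideoRegion:
    """Animation and video: a pixel region plus the additional time dimension."""
    region: PixelRegion
    interval: TimeInterval


def select_text(text, span):
    """Return the characters of a textual node addressed by a TextSpan."""
    return text[span.start:span.end]
```

A link anchor could then carry any one of these location values, leaving the interpretation to the software responsible for that medium.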

2.1.2. Node Typing 

In addition to different contents, nodes may also have different types. Node typing is 
most often important in the context of typed links. For instance, in gIBIS, a Supports 
link can only appear between a node of type Argument and one of type 
Position.[CONK87] Together with link typing, node typing permits the definition of a
grammar or rhetoric over a hyperbase, and greatly facilitates user navigation and 
automatic information retrieval. 
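
A grammar of this kind can be sketched as a table of permitted (link type, source node type, destination node type) triples. Only the Supports rule is taken from the text above; the second rule and all identifier names are assumptions added for illustration.

```python
# Permitted (link type, source node type, destination node type) triples.
GRAMMAR = {
    ("Supports", "Argument", "Position"),    # rule cited in the text
    ("RespondsTo", "Position", "Issue"),     # assumed here for illustration
}


def link_allowed(link_type, source_type, dest_type):
    """Return True if the grammar permits this typed link between typed nodes."""
    return (link_type, source_type, dest_type) in GRAMMAR
```

An authoring tool could call `link_allowed` before creating a link, and a retrieval tool could rely on the same table to know which node types a given link type can reach.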

2.1.3. Node Structure 

The measures of location defined above for nodes of differing contents are sometimes 
too primitive for convenient use. For example, one can define words or sentences in a 
textual node, buttons or sliders in a graphical node, musical phrases in an audio node, 
or positions in a user trajectory in an executable node, hiding the corresponding 
characters, pixels, time intervals, or instruction counts as implementation details. Then 
links can originate or terminate at these higher-order objects. Consistent definition of 
such higher-order objects and their mappings to lower-order entities offers a good
opportunity for standardization. 



2.2. Links 

A discussion of links in a hypermedia system requires definition of directionality, 
topology, types, anchors, and modes. 

2.2.1. Link Directionality 

A link is directional if its ends are differentiated in some way from one another. 
Often, the mechanism for traversing a directional link in one direction is different from 
that used in the other direction. For instance, links in Intermedia are not directional. 
The same icon marks both ends of the link, and the same operation traverses it in both 
directions. In HyperTies, links are directional, and the backward direction is usually 
only accessible if one has already traversed the link in the forward direction.
Cognitively, directional links can be a valuable aid to navigation in a 
hyperbase.[PARU89] 

2.2.2. Link Topology 

Current systems typically do not constrain the overall topology that links can form, 
but user navigation depends critically on this topology, and there are strong cognitive 
motives for disallowing arbitrary topologies. [PARU89] The number of possible 
topologies is countably infinite, but important major classes are linear, hierarchical,
hypercube, and DAG. 
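
To make the topology classes concrete, the following sketch classifies a set of directed links as linear, hierarchical, DAG, or arbitrary. It is an illustration only: the function name is hypothetical, the hypercube class is not detected, and cycle detection uses Kahn's topological-sort algorithm, which is a choice made here rather than anything prescribed by the text.

```python
def classify_topology(nodes, links):
    """Classify directed links as 'linear', 'hierarchical', 'DAG', or 'arbitrary'."""
    out_edges = {n: [] for n in nodes}
    in_degree = {n: 0 for n in nodes}
    for src, dst in links:
        out_edges[src].append(dst)
        in_degree[dst] += 1

    # Kahn's algorithm: repeatedly remove nodes with no remaining incoming
    # links. If some nodes can never be removed, the network has a cycle.
    remaining = dict(in_degree)
    ready = [n for n in nodes if remaining[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for m in out_edges[n]:
            remaining[m] -= 1
            if remaining[m] == 0:
                ready.append(m)
    if removed != len(nodes):
        return "arbitrary"

    roots = [n for n in nodes if in_degree[n] == 0]
    single_parent = all(d <= 1 for d in in_degree.values())
    if len(roots) == 1 and single_parent:
        # A tree; it is a simple chain if no node has more than one child.
        if all(len(out_edges[n]) <= 1 for n in nodes):
            return "linear"
        return "hierarchical"
    return "DAG"
```

A system that wished to disallow arbitrary topologies could run such a check whenever a link is added and reject links that move the hyperbase out of the permitted class.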

2.2.3. Link Types 

By defining various types of links (and typically correlating them with typed nodes), 
we can enrich the rhetorical capabilities of a hyperbase, as discussed above under "Node
Typing."

2.2.4. Link Anchors 

The anchors, or endpoints, of a link are its origin and its destination. The destination 
of a link can either be a node as an atomic unit, or some entity contained within the 
node. In the case of a structured node, this entity will be some element of the 
structure. In the case of an unstructured node, this entity will be either a point or a
region defined by whatever measure of location is appropriate to the node's contents.

If links are constrained to originate with nodes as atomic units, the resulting
hyperbase will have a linear topology, which forfeits the more interesting features of 
hypermedia. Thus at least the origins of links are some element within a structured 
node or some location or region within an unstructured node. 

2.2.5. Link Modes 

The simplest form of a link is a fixed connection between two anchors (either nodes or 
entities within nodes). The order of processing a link is usually select-traverse-display. 
Both the form and the processing of a link can be expanded [BIEB89]; a link can be 
virtual (computed at run-time) rather than fixed, and inferencing can be added both
before and after link traversal. Such additional inferencing can be used to implement
such modes of linking as warm links (in which users can push or pull data over a link)
and hot links (in which data modified at one end of the link is automatically updated
on the other end).[CATL89]
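
The difference between a fixed link and a virtual link can be sketched as two classes that share a `resolve` operation. The class names and the dictionary used as a stand-in hyperbase are assumptions for this example; warm and hot links are not modeled.

```python
class FixedLink:
    """Destination is stored when the link is authored."""

    def __init__(self, destination):
        self.destination = destination

    def resolve(self, hyperbase):
        return [self.destination]


class VirtualLink:
    """Destinations are computed at traversal time from node contents."""

    def __init__(self, predicate):
        self.predicate = predicate

    def resolve(self, hyperbase):
        # Re-evaluated on every traversal, so new nodes are picked up.
        return sorted(name for name, text in hyperbase.items()
                      if self.predicate(text))


hyperbase = {"n1": "link anchors", "n2": "node typing"}
virtual = VirtualLink(lambda text: "node" in text)
```

Adding a node that satisfies the predicate changes what `virtual.resolve(hyperbase)` returns without any edit to the link itself, whereas a `FixedLink` always returns the destination it was given.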

2.3. Composites 

There has been a growing realization among workers in hypermedia that usable 
hyperbases require the ability to manipulate composite entities: entities that are larger 
than, and made up of, individual nodes and links.[HALA87] Such composites can be
defined either rhetorically or topologically. 

Paths [ZELL89] are a simple example of a topological composite. A bare network of
links and nodes is well-suited to random browsing, but many applications of 
hypermedia presuppose a basic trajectory through the hyperbase, with the rest of the 
material available as needed. Paths support such applications by giving writers a way to 
define a backbone that readers should follow, and to which they can readily return after 
any digressions. Topologically, the path imposes a linear topology on a much more 
complicated network, thus combining the cognitive advantages of the simpler topology 
with the flexibility of the more complex one. 

Rhetorical composites are specific constellations of (usually typed) nodes and links
that form a logical unit for manipulation and navigation. For example, the Toulmin
argumentation schema [TOUL69, STRE89] represents an argument as a composite of
nodes that articulate a claim, its supporting datum, the warrant and backing that make
the datum relevant to the claim, and any rebuttal. Derivatives of IBIS such as gIBIS
focus on the basic tree consisting of an issue, various positions on that issue, and the
arguments for and against each of the positions. [CONK87]

2.4. Element Standardization

The elements that we have discussed form the ontological foundation of hypermedia, 
suggesting that at least common terminology needs to be defined if standardization of 
any aspect of hypermedia is to be possible. This basic ontology is stable enough that the
outlines of a reference model constructed now will probably be able to accommodate
new techniques as they are developed, by adding subpoints as appropriate. 

3. Implementation Concerns 

Here we address both architectural and programming issues. 

3.1. Layered Architecture

Architecturally, there is a growing consensus in favor of the value of a layered 
architecture for hypermedia. This approach has been applied both to data 
communications [DAY83] and the control of material handling [PARU88]. It not only
permits modular, maintainable programs, but also facilitates access of a layered system 
by other systems that know the services published at each layer. Thus a layered 
architecture facilitates the development of hyperbases that can interact with one 
another as well as with users.

At least four layers are useful for a layered hypermedia architecture: data, element, 
inference, and interface. 

3.1.1. Data

The data layer provides consistent data management for all information in the
hyperbase, including both the contents of nodes and the links among nodes. If
development and browsing of a hyperbase are to be separate processes, this layer
manages access permissions to implement read-only networks. In a multiuser
hyperbase, this layer must support multiple access with appropriate consistency 
management. Many applications will require it to support versioning as well. As 
hypermedia becomes more widely applied, distributed hyperbases will develop that will 
require the data layer to provide distributed data access, and in this case it would 
logically be defined as an RDA application on top of an OSI stack. 

3.1.2. Element 

The element layer provides separate services for managing nodes and links, and 
translates the raw data of the data layer into these atomic elements of hypermedia. The 
value of storing links separately from nodes is becoming evident, and is supported in 
Intermedia and in the link service furnished with Sun's Network Software 
Environment. [PEAR89] Among other benefits, this separation permits users to have
private sets of links on a document, links that are not visible to other users. The link 
service needs to be able to combine different sets of links over a single document so that 
a user perceives them as forming a single set. Composites can be supported by 
appropriate internal recursion, thus permitting composites of any degree of nesting to
be defined.
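
The requirement that a link service combine several separately stored sets of links into what the user perceives as a single set can be sketched as follows. The tuple layout and the function name are hypothetical; no claim is made that Intermedia or the Sun link service works this way.

```python
def visible_links(document_id, link_sets):
    """Merge separately stored link sets into one view of a single document.

    Each link is a (document_id, anchor, destination) tuple. A link that
    appears in more than one set is shown only once.
    """
    merged = set()
    for links in link_sets:
        merged.update(link for link in links if link[0] == document_id)
    return sorted(merged)


public = [("doc", "intro", "n1")]
private_alice = [("doc", "intro", "n1"), ("doc", "summary", "n9")]
```

A user holding both sets would see `visible_links("doc", [public, private_alice])`, while another user holding only the public set would see only the public link; the document itself is untouched in either case.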

3.1.3. Inference 

The inference layer provides at least the ability to traverse a link and retrieve the 
node at the destination. It is also a reasonable place to house services that do inference 
on source and destination nodes in conjunction with link traversal to support 
generalized link traversal as defined in [BIEB89].

3.1.4. Interface 

The interface layer defines the mechanisms through which the user interacts with the 
hyperbase, and is responsible for displaying the information contained in the node.

3.2. Programming Issues

Object-oriented programming has been an important supporting technology for 
hypermedia, and the development of standards for OOPS will facilitate the interaction 
of various hypermedia systems. 

Some systems, such as HyperTies [COGN89], HyperPAD [BRIG89], and HyperCard
[WILL87], build nodes as a stack of different objects. A typical series of such objects
includes the background, page, field, and button. If nodes are to be accessed through
multiple systems, standardization of node architecture is necessary.

3.3. Implementation Standardization

Implementation standardization is necessary if hypermedia systems are to interoperate 
(for instance, by accessing the same information). A layered architecture offers promise 
as the reference model for such standardization. Outside of the hypermedia community, 
standardization in object-oriented languages and environments will greatly advance the 
foundation on which hypermedia systems rest. 

4. Interface Issues 

There are two main categories of interface issues in hypermedia: those concerned with
constructing links among nodes, and those concerned with browsing a completed
network. While many commercial systems include facilities for generating the contents 
of nodes, this process is so application-dependent that it seems to fall outside the scope 
of a reference model. 

4.1. Building Links 

Constructing the links is the most laborious part of populating a hyperbase. Three 
main sets of techniques are commonly used: automatic, mark-up and point-and-shoot. 

4.1.1. Automatic Linking 

Information retrieval (IR) techniques can be used to build networks automatically, for 
example, linking together all (textual) nodes containing a specified string of characters.
Because these techniques are mainly syntactical and do not "understand" the text, they
must usually be supplemented by manual review and revision to eliminate spurious
linkages and to add links that the syntactical scan misses. Natural language techniques 
from AI are beginning to improve the effectiveness of automatic linking, but still are 
not able to "understand" a text and so cannot completely eliminate manual 
editing. [HAYE88] Applied in real time, these techniques are a common way to
implement virtual links. Standardization of IR techniques is marginally useful for the 
construction of links before run-time, since manual editing can correct any errors, but 
will be useful when these techniques implement virtual links, to insure consistent 
operation of such links across various implementations. 
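
The simplest form of automatic linking described above, joining all textual nodes that contain a given string, can be sketched in a few lines. The function name and the sample nodes are invented for the illustration. The fourth node shows the kind of spurious linkage that a purely string-based scan produces and that manual review would have to remove.

```python
from itertools import combinations


def link_by_string(nodes, needle):
    """Link every pair of textual nodes whose text contains `needle`."""
    hits = sorted(name for name, text in nodes.items() if needle in text)
    return list(combinations(hits, 2))


nodes = {
    "a": "The standard defines a link format.",
    "b": "A link joins two anchors.",
    "c": "Nodes hold information.",
    "d": "Blinking cursors are distracting.",   # matches only inside "Blinking"
}
```

Here `link_by_string(nodes, "link")` links node `d` to nodes `a` and `b` even though `d` has nothing to do with links, because the scan matches characters rather than meaning.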

4.1.2. Mark-Up Linking 

Many PC-based systems require manual mark-up with a text editor to identify link 
sources (and sometimes destinations). The most simple systems simply enclose link 
anchors in reserved brackets, which on execution are interpreted by the display manager 
and result in modified display attributes for the anchor. A more complex mark-up 
system, such as those conforming to [ISO86], provides a rich language for specifying
functional components of a document, such as paragraph and chapter headers. While 
these mark-up languages are not originally designed for hypermedia, they provide a 
useful mechanism for facilitating automatic linking. 
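
The reserved-bracket style of mark-up can be illustrated with a short sketch. The double square brackets are a hypothetical choice of delimiter, and upper-casing stands in for whatever modified display attribute a real display manager would apply; neither is taken from a particular product.

```python
import re

# Hypothetical reserved brackets: [[anchor text]]
ANCHOR = re.compile(r"\[\[(.+?)\]\]")


def extract_anchors(marked_up):
    """Return (display_text, anchors) for text marked up with [[...]]."""
    anchors = ANCHOR.findall(marked_up)
    # Upper case stands in for a highlighted display attribute.
    display = ANCHOR.sub(lambda m: m.group(1).upper(), marked_up)
    return display, anchors
```

For the input `"See [[nodes]] and [[links]]."` the anchors are `nodes` and `links`, and the brackets themselves do not appear in the displayed text.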

4.1.3. Point-And-Shoot Linking 

The most sophisticated manual linking systems (for example, [PEAR89]) use a point-
and-shoot interface that permits the user to point at the entities to become anchors and 
thus generate links directly. 

4.2. Browsing 

Browsing issues include the form and manipulation of the display, and navigational
mechanisms.

4.2.1. Display

One area of active discussion in the hypermedia community is whether information 
should be divided into screen-sized chunks or "cards," or whether the screen should be 
treated as a window that moves over a larger unit of information. There appear to be 
applications where each approach is superior, and both should be accommodated in a 
reference model. 

A number of issues concern the mechanics of manipulating the screen. For instance, 

• In a scrolling system, does one push the window up over the information, or 
does one push the information up past the window? 

• How does one select a link origin? 

• How are active and inactive buttons represented on the screen? 

• What is the correspondence between mouse action and cursor keys? 

The Macintosh has provided a de facto standard for many of these issues. While 
standards are highly desirable (especially for users who must move from one platform to 
another), they are probably best handled in the broader CHI community, not by 
hypermedia specialists. 

4.2.2. Navigational Mechanisms 

Navigational mechanisms are of two main types: maps and path macros. 

4.2.3. Maps 

A map is a single display that shows nodes in abbreviated form (often as icons) and
displays the links among them. While intuitive, a map can become cluttered and
relatively useless for large, complex systems unless it is selective. For instance, a map
displaying only links of a certain type and their associated nodes, or only composite
nodes and not their components, will be simpler than a complete map.

4.2.4. Path Macros

A path macro is a composite that is generated in real time by gathering together 
nodes that the user has visited and the links along which they were visited, at least up
to some limiting topology. For instance, a linear topology is commonly used to generate 
a backup stack. A path macro permits the user easily to revisit nodes that have been 
seen and are of particular interest. 
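
The linear case can be sketched as a simple backup stack. This is only an illustration under stated assumptions: the class name `PathMacro` and its methods are hypothetical, and a real system would presumably store richer node and link objects than the strings used here.

```python
from dataclasses import dataclass, field


@dataclass
class PathMacro:
    """Records visited nodes and traversed links as a linear backup stack."""

    # Each entry is (node, link_used_to_reach_it); the first link is None.
    trail: list = field(default_factory=list)

    def visit(self, node, via_link=None):
        """Record arrival at `node` after following `via_link`."""
        self.trail.append((node, via_link))

    def back(self):
        """Drop the current node and return the previous one, or None."""
        if len(self.trail) > 1:
            self.trail.pop()
            return self.trail[-1][0]
        return None

    def visited_nodes(self):
        """Return the nodes currently on the stack, oldest first."""
        return [node for node, _ in self.trail]
```

Calling `visit` for each traversal and `back` to retrace gives the familiar "go back" behavior; other limiting topologies would need a different container than a list.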

4.3. Interface Standardization 

Interface standardization is desirable, especially for people who must use more than 
one platform on a regular basis. Much of the desired standardization here will come not 
through work specifically in hypermedia, but through broader forums in CHI. 

5. Conclusion 

Hypermedia, especially in distributed applications, will benefit from standardization. 
To facilitate developing such standards, this paper has suggested a high-level reference 
model that describes the elements, implementation concerns, and interface issues for 
hypermedia. In the area of elements, the greatest need for standardization is in 
vocabulary. Implementation offers a rich possibility for standardization in the 
development of a layered model for hypermedia, and will profit from OOPS 
standardization being pursued elsewhere. Most of the interface standardization that is 
possible at this point is being pursued in the broader CHI community, and (apart from 
navigational devices that are particular to hypermedia) should not be the focal point of 
standardization efforts by the hypermedia community. 
ABSTRACT 

Realization of the potential for information sharing 
that is inherent in hypertext systems depends on the 
ability to readily exchange data between those sys- 
tems. A format for exchanging link-related data be- 
tween first-order hypertext systems has been de- 
signed, and partially implemented, for the 
Intermedia system. The design is described to the 
individual field level. An example of usage for 
Intermedia link-related information is provided. 
The import, export, and verification utilities cre- 
ated for the interchange format are also described. 

1. INTRODUCTION 

The concept of hypertext has been around for several 
decades and recently we have seen the advent of 
several hypertext applications and systems. These 
applications allow one to create text, graphics, ani- 
mation, video, and a number of other data types and 
proceed to link them together in any manner one sees 
fit. One capability that is still missing is the abil- 
ity to transfer a set of hypertext links and docu- 
ments from one system to another. Such a capability 
would open the door to sharing information and 
bring us one step closer to the mythical 
"hyperspace" or "docuverse" [Nels81] as Nelson has 
termed it. This paper examines a format for allow- 
ing interchange between hypertext systems. 

2. PURPOSE OF THE INTERCHANGE FORMAT 

Although a wide variety of hypertext/hypermedia 
systems exist today, they can be placed into one of 
two categories. 

A first-order hypertext system manipulates the data 
of 

• documents 



• anchors within documents 

• links between anchors 

• some standard attributes associated with docu- 
ments, anchors, and links. (The standard at- 
tributes include the name, creation time, and cre- 
ator of a document, anchor, and link.) 

Most hypertext systems in existence today are at 
least first-order hypertext systems [Conk87]. 

A second-order hypertext system manipulates all 
the information a first-order hypertext system 
contains, with additional support for 

• user-defined objects and types 

• user-defined attributes and keywords 

• version history for documents, anchors, links, 
and attributes 

There are only a few second-order hypertext systems 
in existence or development today: Engelbart's 
NLS/Augment [Enge68], Tektronix's Hypertext 
Abstract Machine [Camp88], and Nelson's Xanadu 
[Nels81]. 

Regardless of these categories, all hypertext sys- 
tems need to store this persistent link data in some 
form of database. Since database formats and data- 
base files are inherently nonportable, a portable in- 
terchange format must be designed to facilitate ex- 
changing sets of link-related hypertext data (what 
would be called webs in Intermedia). 

Our interchange format contains the essential 
link-related information for a first-order hypertext 
system. Any application or system that understands 
the interchange format (what we call here a 
participating application or system) can capture all 
the existing hypertext link information as it exists 
in some other participating hypertext system. In 
conjunction with methods for converting and 
transferring document data, this capability makes 
possible the complete sharing of information between 
hypertext systems, largely fulfilling the "docuverse" 
ideal. 

The interchange format is useful for transferring 
data between similar first-order hypertext systems. 
It may also be useful for transferring first-order 
hypertext information into a second-order hypertext 
system or vice versa. Suitable defaults could be 
supplied for the extra information necessary to 
transform first-order information into second-order; 
when transferring second-order information into 
first-order, the extra information could be ignored. 

It needs to be stressed that the application-specific 
contents and format of hypertext documents themselves 
are outside the scope of the interchange format 
(which is concerned with the links between the 
documents) and of this paper. Data exchange on the 
document level is approached in other ways, commonly 
by adherence to a file format standard, such as 
PICT/TIFF, MacPaint, or RTF. 

3. THE INTERCHANGE FORMAT 
3.1 The Basic Objects 

The information that most hypertext systems deal 
with is basically the same, although the names of 
objects may differ slightly from one system to the 
next. A first-order hypertext system deals with doc- 
uments, anchors, links, and system attributes. These 
objects are stored in a database that the system's 
subordinate applications access in order to provide 
linking functionality. In the interchange format, 
each of these objects corresponds to a separate data 
file that contains the information specific to all 
occurrences of that object in the system. The 
architecture of these files is described in the next 
section. 

Documents are the containers for the application- 
specific information in the hypertext system. They 
are built up of two components: the actual applica- 
tion-specific contents of the document (the informa- 
tion the user is interested in working with), and the 
information necessary for the application to render 
its views. The contents could be in the form of text, 
graphics, audio, video, etc. 

Anchors are the locations in documents to which 
links are attached. Some examples of anchors are 
spans of text, graphical objects, audio or video, or 
bitmaps. Anchors are application-specific in that it 
is the application, not the hypertext system's 
database, that must render the anchor (e.g., in doc- 
ument views). 

Links are the connections between anchors. They are 
directional in that they have a source and destina- 
tion anchor. Applications can enforce bidirectional- 
ity or directionality by giving equal precedence to 
both source and destination, or keeping the distinc- 
tion. 

System attributes are predefined attributes that are 
associated with documents, anchors, and links. For 
all first-order hypertext systems, these consist of 
the name, creator, and creation time. Intermedia 
adds the modifier and last modification time to the 
standard system attributes. 

User-defined attributes are also associated with 
documents, anchors, and links. They allow for 
flexible processing and retrieval of hypertext 
information. 



3.2 Architecture of the Data Files 

The interchange format consists of five data files for 
recording information about the link-related objects 
in the participating hypertext system, and one file 
for each document in the hypertext system. 

document information 

The document information file contains general in- 
formation dealing with all hypertext documents 
stored in the participating system. This information 
allows an application to gain access to the physical 
location of a document, get the user-defined access 
rights associated with the document, and retrieve 
information about the creator and last modifier of 
the document. A unique identifier for the document 
enables access to anchor information stored in the 
anchor file (described below). 

anchor file 

The anchor file contains information about all an- 
chors in all documents in the hypertext system. This 
information allows an application to know where an 
anchor is located, who created and last modified 
the anchor, and other information that may be 
needed (e.g., to render a view of the anchor). A 
unique identifier for the anchor enables access to 
link information stored in the link file (described 
below). 

link file 

The link file contains information about all links 
between all anchors in the hypertext system. This 
information allows the system to traverse hypertext 
links. The file also contains information about the 
creator and last modifier of the link. A name and 
unique identifier for the link are provided, for 
consistency with the other files, and to allow for 
future expansion of functionality. 

attribute definition file 

The attribute definition file contains information 
defining the attributes and keywords used in the 
system. Predefined (system) attributes such as name, 
creator, modifier, creation time and modification 
time, are not defined in this file. 

attribute file 

The attribute file contains information about which 
objects have which attributes attached to them, as 
well as the values of those attributes. 
document 

The format of each document file is determined by 
its contents, and the requirements of the 
participating application in which it is used. 
Formats currently employed in Intermedia include 
"web," formatted text, structured graphics, 
timeline, and bitmap image. As noted above, the 
exchange of this information between systems is not 
intended to be part of the interchange format. 
However, several fields in the five link-related 
files are indirectly dependent on the existence, 
system attributes, or contents of the document 
files. These are described under "Implementation." 

3.3 Implementation 

This section describes the interchange format at the 
level of data formatting and field definition. 
Examples illustrating these descriptions are pro- 
vided in Section 4. 

Data Formatting 

In order to make the interchange process as 
straightforward as possible, the format of the data 
to be exchanged is kept simple. 

Each value is stored in normal ASCII format, so 
that it is easily readable, editable, and portable. 

Each data record in a file is delimited by a 
carriage-return/linefeed character pair. Each data 
field in a record is delimited by a tab character. To 
avoid conflicts, the tab character is not permitted in 
document and path names. 
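
A minimal sketch of reading data laid out this way, assuming the file contents have already been read into a Python string. The function name `parse_records` and the sample values are illustrative only and are not part of the interchange format.

```python
def parse_records(data: str) -> list[list[str]]:
    """Split interchange-format text into records and fields.

    Records end with a carriage-return/linefeed pair; fields within
    a record are separated by tab characters.
    """
    records = []
    for line in data.split("\r\n"):
        if line:  # skip the empty piece after the final delimiter
            records.append(line.split("\t"))
    return records


sample = "65537\t1\t300\r\n65537\t2\t301\r\n"
records = parse_records(sample)
```

Every field comes back as a string; numeric fields would still need to be converted with `int()` by the importing application.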

Data values are either strings or numbers. String 
values can be any length. Numeric values are four 
full bytes; the decimal ASCII digits correspond to an 
unsigned 32-bit long word. Certain numeric fields 
store information in terms of the bit patterns in the 
long word. 

All numeric values that denote a time are stored in 
Unix GMT format, which expresses a time value as 
the number of "ticks" since an established starting 
point (midnight of January 1, 1970). There are about 
31.5 million ticks in a calendar year. 
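
As a hedged illustration, a time field can be turned into a calendar date with Python's standard library, assuming one tick is one second (which is what roughly 31.5 million ticks per year implies). The helper name `ticks_to_gmt` is made up for this sketch; the value 604771573 is the creationTime that appears in the sample data of Section 4.2.

```python
from datetime import datetime, timezone


def ticks_to_gmt(ticks: int) -> datetime:
    """Interpret a numeric time field as seconds since 1970-01-01 GMT."""
    return datetime.fromtimestamp(ticks, tz=timezone.utc)


stamp = ticks_to_gmt(604771573)
# Seconds in a 365-day year, for comparison with the figure quoted above.
ticks_per_year = 365 * 24 * 60 * 60
```

Here `stamp` corresponds to March 1, 1989, 16:06:13 GMT, and `ticks_per_year` is 31,536,000.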

Values for the predefined system attributes 
(creationTime, modTime, creator, modifier, and 
name) are obtained from the operating system via 
the Export utility. 

Since some applications may require data not 
specifically identified in the interchange format, 
certain fields are allotted for this special purpose. 
Data in these fields is arbitrarily stored in string 
format, for maximum flexibility, and may need to be 
converted to some other data format for use by a 
target application. This feature allows for a 
variable number of data values and types to be 
transferred by the interchange format. 

Site Identification 

The first field of each record contains a 
site-specific ID. This value is composed of a unique 
number for each site (or machine) using the 
interchange format and a site-unique number for the 
database to which hypertext data is being imported 
or exported. The combination of a siteID (with its 
"site" and "database" components) and an object's 
own unique ID allows the object to permanently 
maintain its identity across exchanges of data 
between sites. 
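
The composition of the site ID can be sketched as follows. The sketch assumes that the site number occupies the high-order 16 bits and the database number the low-order 16 bits of the 32-bit value; the function names are hypothetical.

```python
def pack_site_id(site: int, database: int) -> int:
    """Combine 16-bit site and database numbers into one 32-bit value.

    Assumption: the site number is the high-order short word.
    """
    if not (0 <= site <= 0xFFFF and 0 <= database <= 0xFFFF):
        raise ValueError("site and database numbers must fit in 16 bits")
    return (site << 16) | database


def unpack_site_id(site_id: int) -> tuple[int, int]:
    """Split a 32-bit site ID back into (site, database)."""
    return site_id >> 16, site_id & 0xFFFF


example = pack_site_id(1, 1)
```

With site number 1 and database number 1, the packed value is 65537, the siteID used in the sample data of Section 4.2.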

Some type of assignment of unique numbers for sites 
must be administered in order to implement this fea- 
ture fully. If this were not done, however, the re- 
mainder of the interchange format could still be im- 
plemented independently. 

Another uniqueness scheme might consist of combin- 
ing a 32-bit random number with two 16-bit random 
numbers, which would provide IDs for the site and 
the local database, respectively. This 64-bit number 
should be unique across the domain of all hypertext 
systems. 

Field Definitions 

document information file fields 

siteID          (Numeric) Unique identifier of the 
                originating site and database. 

                Note: The first short word of the value 
                stores the site number; the second short 
                word stores the database number. The 
                interchange format stores the number 
                resulting from reading the two short 
                words as a long word. 

docID           (Numeric) Unique identifier of a 
                document. Assigned sequentially by 
                the DBMS. 

docType         (Numeric) Code specifying the 
                document's type. 

                Allows the system to identify the 
                correct target application for 
                application-specific data. 

                Note: Intermedia supplies codes for its 
                currently supported document types 
                (InterWord, InterDraw, etc.). Codes must 
                be standardized for participating 
                systems, be these numeric codes or 
                string codes. 

accessRights    (Numeric) Number expressing the 
                types of access allowed to the 
                document for various groups of users. 

                Note: The four bytes of the value, from 
                high to low, correspond to the rights 
                granted to: system administrator, owner, 
                group, and world (all) users. The bits 
                of each byte, from high to low, 
                correspond to the following rights 
                granted to each of the four user groups: 
                change access rights for the document, 
                write to the document, create links in 
                the document, and view the document. 
                The bits are set on or off in groups of 
                two. 

groupName       (String) The name of the group 
                identified in accessRights. 

creationTime    (Numeric) Time the document was 
                created. 

modTime         (Numeric) Time the document was 
                modified. 

creator         (String) Name of the user that 
                created the document. 

modifier        (String) Name of the last user to 
                modify the document. 

docName         (String) Name of the document. 
                Assigned by user when document is 
                saved. 

path            (String) Directory location of the 
                document in the Unix tree, relative 
                to the application's home directory. 

anchor file fields 

siteID          (See description for document 
                information file.) 

anchorID        (Numeric) Unique identifier of an 
                anchor; assigned sequentially by 
                the DBMS. 

anchorDocID     (Numeric) Value of docID, in the 
                document information file, for the 
                document containing the anchor 
                identified by anchorID. 

                Allows system to determine the 
                document in which the anchor is 
                located. 

creationTime    (Numeric) Time the anchor was 
                created. 

modTime         (Numeric) Time the anchor was 
                modified. 

creator         (String) Name of the user that 
                created the anchor. 

modifier        (String) Name of the last user to 
                modify the anchor. 

anchorName      (String) Name of the anchor. 

X-loc           (Numeric) X, Y, and Z-axis 
Y-loc           coordinates of the anchor, within the 
Z-loc           document specified by docID. 

                These allow system to determine 
                placement of anchor in document 
                window. 

                Interpretation of coordinates is 
                application-specific. 

appData         (String) Application-specific 
                information dealing with anchors. 

                Allows participating application 
                to obtain other information 
                required. Examples might include 
                data needed to render a type of 
                window view. 

                Values are separated by space 
                characters, or other delimiters 
                specified by the participating 
                application. 
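
The bit layout given for the accessRights field of the document information file can be illustrated with a short decoding sketch. It assumes that a right counts as granted when both bits of its two-bit group are set, which is one reading of "set on or off in groups of two"; the names `USER_GROUPS`, `RIGHTS`, and `decode_access_rights` are invented for this example. The value 4294967055 is the one used in the sample data of Section 4.2.

```python
# Bytes of the long word, from high to low.
USER_GROUPS = ("system", "owner", "group", "world")
# Two-bit groups within each byte, from high to low.
RIGHTS = ("change_access", "write", "create_links", "view")


def decode_access_rights(value: int) -> dict[str, list[str]]:
    """Map each user group to the list of rights it has been granted."""
    granted = {}
    for i, who in enumerate(USER_GROUPS):
        byte = (value >> (8 * (3 - i))) & 0xFF
        rights = []
        for j, right in enumerate(RIGHTS):
            pair = (byte >> (2 * (3 - j))) & 0b11
            if pair == 0b11:  # assumption: both bits set means granted
                rights.append(right)
        granted[who] = rights
    return granted


decoded = decode_access_rights(4294967055)
```

For this value, the system administrator, owner, and group receive all four rights, while "world" users receive only `create_links` and `view`.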



link file fields 

siteID          (See description for document 
                information file.) 

linkID          (Numeric) Unique identifier of a 
                link; assigned sequentially by the 
                DBMS. 

linkType        (Numeric) Code specifying the type 
                of relationship between the link's 
                two anchors. 

                Note: Intermedia supplies codes for its 
                currently supported document link types. 
                Codes must be standardized for 
                participating systems, be these numeric 
                codes or string codes. 

srcAnchorID     (Numeric) Source anchor of the 
                link, as identified by the value of 
                anchorID, in the anchor file. 

destAnchorID    (Numeric) Destination anchor of 
                the link, as identified by the value 
                of anchorID, in the anchor file. 

creationTime    (Numeric) Time the link was 
                created. 

modTime         (Numeric) Time the link was 
                modified. 

creator         (String) Name of the user that 
                created the link. 

modifier        (String) Name of the last user to 
                modify the link. 

linkName        (String) Name of the link. 

attribute definition file fields 

siteID          (See description for document 
                information file.) 

attDefID        (Numeric) Unique identifier of an 
                attribute definition; assigned 
                sequentially by the DBMS. 

attDefType      (Numeric) Code specifying the 
                attribute's type. 

                General-purpose flag value. One 
                potential use is to specify what 
                objects the attribute can be 
                attached to. 

                Note: A participating system supplies 
                codes for its currently supported 
                attribute definition types. Codes must 
                be standardized for participating 
                systems, be these numeric codes or 
                string codes. 

attName         (String) Name of the attribute. 

attribute file fields 

siteID          (See description for document 
                information file.) 

attDefID        (Numeric) Value of attDefID, in 
                the attribute definition file. 

                Allows system to look up the 
                attribute's name and type. 

attValType      (Numeric) Code specifying the 
                data format of attValue. 

                Note: A participating system supplies 
                codes for its currently supported 
                attribute value types. Codes must be 
                standardized for participating systems, 
                be these numeric codes or string codes. 

attValue        (Variable format) Value of the 
                attribute. Assigned by the user. 

                The next three fields refer to the 
                object to which the attribute is 
                attached (document, anchor, or 
                link). 

objectType      (Numeric) Code specifying the 
                object type (document, anchor, or 
                link). 

                Note: A participating system supplies 
                codes for the object types of document, 
                anchor, and link. Codes must be 
                standardized for participating systems, 
                be these numeric codes or string codes. 

objSiteID       (Numeric) Value of siteID, in the 
                corresponding file (document 
                information, anchor, or link). 

objectID        (Numeric) Value of the object's ID, 
                in the corresponding file (document 
                information, anchor, or link). 

4. EXAMPLE OF USE 

This section illustrates how the interchange format 
can be used to create, store, and reuse link-related 
data from a first-order hypertext system, namely 
Intermedia. 

4.1. Sample Data In Intermedia 

The Intermedia system is described in a number of 
articles, notably [Meyr86] and [Yank88a]. A public 
release of the software, with full documentation, is 
also currently available through IRIS and through 
the Apple Programmer and Developer's Association 
(APDA). This release (3.0) runs on the Apple 
Macintosh II, under version 1.1 of A/UX, Apple's 
version of Unix. 

Figure 1 shows the Intermedia desktop environment. 
Two elementary sample documents have been 
created, one in Intermedia's InterWord format, the 
other in InterDraw. For the clarity of the example, 
these objects have been created in an empty new 
Intermedia database. The folder window (labelled 
"/int/docs/demo") contains the icons representing 
the documents and the Web comprising the links 
between them. The Web View window displays the 
linking structure. The information used in generating 
this Web View is also used to generate the anchor 
and link files of the interchange format. 

An anchor has been created from the word "block" in 
the InterWord document (indicated by the arrow 
marker over the word). Another anchor has been 
created from the two rectangles in the InterDraw 
document. Each anchor can be assigned a name; the 
names are not shown here, but can be viewed and 
edited by the user by means of dialog boxes. 

The current version of Intermedia does not make use 
of attributes and keywords, so these are not 
represented in the example. 

At the moment shown in the figure, the link be- 
tween the two anchors has just been followed, from 
the InterWord to the InterDraw document. This is 
shown by the shaded selection handles around the 
rectangles and the shaded link line in the Web 
View. 



[Figure 1: screen image of the Intermedia desktop. It shows the folder windows "/int/docs" and 
"/int/docs/demo"; the wordDoc window containing the text "This is a text document, and here is 
a block."; the drawDoc window; and the webDoc Web View window ("2 documents in web", "1 link in 
web") with connected icons for drawDoc and wordDoc.] 

Figure 1. Sample documents on the Intermedia desktop. Linking is indicated by the arrow markers in the 
document windows and the icon-connecting line in the Web View window. 



Intermedia allows users to edit the access rights to 
documents, through the use of the "Document 
Properties" dialog box (a simple matrix of sixteen 
check boxes, not shown here). The ability to edit 
these rights is itself controlled by the rights 
scheme, with the system administrator having 
ultimate control over a document's access. The rights 
for the two documents in this example are set so as 
to grant the system administrator, document owner, 
and members of the owner's "group" the right to 
perform all operations on those documents; all other 
users (the "world") can only read them and make 
links in them. 



4.2. Sample Data In the Interchange Format 

This section illustrates how a current version of the 
interchange format stores the first-order hypertext 
link information embodied in the sample Intermedia 
environment in Figure 1. 

After creation of the documents, anchors, and links 
in Intermedia, the link-related information stored in 
the Intermedia database is converted into the inter- 
change format by use of the Export utility, which is 
described in Section 5. 
The ASCII data values resulting from this conversion 
are shown in the following tables, as they would 
appear when viewed in a text editor (minus their 
field and record delimiter characters). These values 
fully describe the anchor, link, and 
document-properties information contained in the 
Intermedia database for the documents depicted in 
Figure 1. 

document info file 

Field           Value          Value 
siteID          65537          65537 
docID           1              2 
docType         300            301 
accessRights    4294967055     4294967055 
groupName       iris           iris 
creationTime    604771573      604771642 
modTime         604771573      604771642 
creator         var            var 
modifier        var            var 
docName         wordDoc        drawDoc 
path            demo           demo 

It is arbitrarily assumed that the ID numbers for 
both the current site and converted database are 1. 
Using the rule for generating the value of the 
siteID field noted under "Implementation," the 
following long word is stored: 

    00000000 00000001    00000000 00000001 
    site number          database number 

This is displayed as the number 65537. Note that 
this value is the same for every data record in the 
example. 

The documents in the example were the first created 
in the Intermedia database, so their docID numbers 
are "1" and "2". 

The docType uses Intermedia type codes: "300" for 
InterWord, "301" for InterDraw. 

The accessRights are stored in the bit pattern of the 
value's long word. The value for the documents in 
this example is written in ASCII as "4294967055," 
which is equivalent to the bits: 

    11111111  11111111  11111111  00001111 
    system    owner     group     world 

Using the rule noted under "Implementation," system 
administrator, owner, and "group" users can perform 
all operations on these documents; "world" users can 
only read them and make links in them. 

The groupName of the group referred to in the 
accessRights is "iris". The creator and modifier 
fields contain the user ID of the author of this 
example: "var". 

The creationTime, expressed in Unix GMT format as 
"604771573," is Wednesday, March 1, 1989, 
4:06:13 PM. 

The docName values of the two documents are those 
shown in the documents' windows in Figure 1. The 
relative path name of the document files is that 
shown in the folder window in Figure 1. 

anchor file 

Field           Value          Value 
siteID          65537          65537 
anchorID        1              2 
anchorDocID     1              2 
creationTime    604771726      604771729 
modTime         604771726      604771729 
creator         var            var 
modifier        var            var 
anchorName      Source Anchor  Destination Anchor 
X-loc           40             23 
Y-loc           45             28 
Z-loc           0              0 
appData         1              1 203 

The anchors in the example were the first created in 
the Intermedia database, so their anchorID numbers 
are "1" and "2". Their anchorDocID values identify 
the documents they were created in: "1" (the 
InterWord document) and "2" (the InterDraw 
document), respectively. 
The anchorNames of the anchors are "Source 
Anchor" and "Destination Anchor". These names are 
informational; they do not affect the directionality 
of the link. 

The X, Y, and Z coordinates for each anchor, and 
the values in the appData field, are interpreted by 
the applications associated with the documents 
identified in the anchorDocID field (InterWord and 
InterDraw), in ways dependent upon the document 
contents. For instance, the data value for anchor 1 
specifies the "anchor type," while the values for 
anchor 2 specify: the two objects the anchor is con- 
nected to, the "view index" of the anchor, and the 
"mark type" (these terms are included for illustra- 
tion; their definition is outside the scope of this pa- 
per). Other link-related data values that do not fit 
elsewhere in the architecture of the interchange 
format can be recorded here in similar fashion. 



link file 

Field           Value 
siteID          65537 
linkID          1 
linkType        2 
srcAnchorID     1 
destAnchorID    2 
creationTime    604771731 
modTime         604771731 
creator         var 
modifier        var 
linkName        Demo Link 



The link between the anchors in the two documents 
in the example was the first created in the 
Intermedia database, so its linkID number is "1". 

The linkType uses Intermedia type codes; "2" de- 
notes a "reference" link. 

The "source" anchor of the link is the one identified 
in the anchor file by the anchorlD of "1"; conse- 
quently "1" is stored here for srcAnchorlD. The 
"destination" anchor of the link is treated in paral- 
lel fashion. Keep in mind that linking in Intermedia 
is bidirectional; the distinction between source and 
destination is maintained for participating systems 
that distinguish between the two. 



The linkName of the link is "Demo Link". This 
value is not presently used in Intermedia, but is 
stored for consistency, in the event it is needed for a 
future version of Intermedia, or for another partici- 
pating system. 

There are a number of other fields in the 
interchange format that are used this way, providing 
flexibility beyond the bare needs of Intermedia 
itself. SiteID, and the creationTime, modTime, 
creator, and modifier fields in the anchor and link 
files are examples. 

attribute definition and attribute files 

Although attributes were not included in this 
Intermedia example, their use in this context can be 
illustrated hypothetically. 

For instance, in order to support optional 
unidirectional linking, an attribute with the attName 
of "anchorType" could be entered in the attribute 
definition file. Codes for "source" and "destination" 
could then be entered as values for attValue in the 
attribute file, and attached to particular anchors by 
marking the requisite entries for objectType and 
objectID. 

Another significant use of user-defined attributes is 
for filtering of hypertext information based on 
keywords, which are text strings attached by the user 
to hypertext objects. Keywords serve as flags for 
associating objects with each other. Typical keywords 
might be "Modernism," "Mitosis," "Moon," or 
"Manichean." Keywords can be implemented by 
defining an attribute named "Keyword" and allowing 
users to enter their keywords as values for the 
attribute. 
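
A hypothetical sketch of keyword filtering over the two attribute files follows. The tuple field order mirrors the field definitions in Section 3.3 as read here, and the numeric codes (0 for the type fields, 1 for "document", 2 for "anchor") are invented for illustration, since the paper leaves such codes to participating systems.

```python
def objects_with_keyword(att_defs, attributes, keyword):
    """Return (objectType, objSiteID, objectID) for each object
    carrying the given keyword.

    att_defs rows:   (siteID, attDefID, attDefType, attName)
    attributes rows: (siteID, attDefID, attValType, attValue,
                      objectType, objSiteID, objectID)
    """
    # IDs of every attribute definition named "Keyword".
    keyword_ids = {row[1] for row in att_defs if row[3] == "Keyword"}
    return [
        (row[4], row[5], row[6])
        for row in attributes
        if row[1] in keyword_ids and row[3] == keyword
    ]


# Illustrative rows only; the type codes are assumptions.
att_defs = [(65537, 1, 0, "Keyword")]
attributes = [
    (65537, 1, 0, "Moon", 1, 65537, 1),     # document 1
    (65537, 1, 0, "Mitosis", 1, 65537, 2),  # document 2
    (65537, 1, 0, "Moon", 2, 65537, 1),     # anchor 1
]
moon_objects = objects_with_keyword(att_defs, attributes, "Moon")
```

Objects sharing a keyword are thereby associated with each other without any explicit link between them.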

document files 

The operating system files that store the contents of 
the Intermedia documents shown in Figure 1 are 
located in the directory identified in the path field 
of the interchange format's document information 
file. The names of the document files are stored in 
the docName field of the same interchange format 
file. 

As noted in Section 2, the application-specific 
contents and format of the document files are not 
considered part of the interchange format. In order 
to support such exchange of document information, 
Intermedia provides various methods for importing 
and exporting document content data. These methods 
include the use of standard file formats, such as RTF 
(for InterWord documents), PICT (for InterDraw 
documents), and TIFF or MacPaint (for InterPix 
bitmap images). 
4.3. Other Intermedia Usage of the Interchange Format 

An early version of the interchange format has 
already been used in the suite of "Webware" products 
making up part of the public release version of 
Intermedia. The procedure for installing the webs 
for "Exploring the Moon" and the Intermedia 
Tutorials into the Intermedia link server database 
involves running a script that calls the Import 
utility, which transfers web data in the interchange 
format from a floppy disk to the Intermedia server 
hard disk. The Import utility is described in Section 
5 of this paper. 

This early prototype of the interchange format does 
not support attributes or SitelDs, and the storage for 
anchors is tailored to their treatment within 
Intermedia. 

5. UTILITIES FOR THE INTERCHANGE FORMAT 

A number of utilities have been created for use with 
the interchange format. Some of the utilities process 
the data of the interchange format to validate the 
data; others are used in conjunction with the 
Intermedia Link Protocol Server ("the link server") 
to import and export data into the Intermedia 
database. 

5.1. Verify 

The Verify utility checks the consistency of the 
interchange format files. It ensures that all 
documents exist for all anchors, and that all anchors 
exist for all links. If keywords are implemented, the 
utility ensures that all documents, anchors, and 
links exist for all keywords. A series of hash tables 
is used during the checking process. If any ID is not 
in the hash table, the object being processed is 
removed and placed in an error file, and the user is 
informed. 
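
The kind of consistency check described here can be sketched with Python sets standing in for the hash tables. This is not the Verify utility itself: the record layout (dictionaries keyed by field name), the omission of siteID and keyword checks, and the choice to treat a rejected anchor as missing when links are checked are all assumptions made for brevity.

```python
def verify(documents, anchors, links):
    """Return (valid_anchors, valid_links, errors).

    Anchors must refer to an existing document; links must refer to
    existing anchors at both ends. Rejected objects go to `errors`.
    """
    doc_ids = {d["docID"] for d in documents}  # hash table of document IDs
    valid_anchors, errors = [], []
    for anchor in anchors:
        if anchor["anchorDocID"] in doc_ids:
            valid_anchors.append(anchor)
        else:
            errors.append(anchor)

    anchor_ids = {a["anchorID"] for a in valid_anchors}
    valid_links = []
    for link in links:
        if (link["srcAnchorID"] in anchor_ids
                and link["destAnchorID"] in anchor_ids):
            valid_links.append(link)
        else:
            errors.append(link)
    return valid_anchors, valid_links, errors


documents = [{"docID": 1}, {"docID": 2}]
anchors = [
    {"anchorID": 1, "anchorDocID": 1},
    {"anchorID": 2, "anchorDocID": 2},
    {"anchorID": 3, "anchorDocID": 9},  # refers to a missing document
]
links = [
    {"linkID": 1, "srcAnchorID": 1, "destAnchorID": 2},
    {"linkID": 2, "srcAnchorID": 1, "destAnchorID": 3},  # dangling
]
good_anchors, good_links, rejected = verify(documents, anchors, links)
```

In a real utility the `rejected` list would be written to the error file and reported to the user.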

5.2. Export and Import 

The Export and Import utilities are used to extract 
and store, respectively, the data from Intermedia's 
database using the link server. 

Earlier prototypes of these two utilities were help-
ful in the conversion of our Intermedia databases
when we exchanged Ingres for the Intermedia link 
server and its new database system based on C-Tree 
[Fair88]. The utilities have also helped us convert 
databases from one data dictionary format to an- 
other, by running Export with an old-format server, 
and Import with a new-format server. 

The Import utility reads the files of the interchange
format and calls the import functions of the link
server to add the data to the database. One param-
eter to the utility specifies whether to create new
IDs for each object being added to the database or to
reuse the existing object IDs. This feature allows us
to either append data to the end of the database
(with new IDs), or replace the data in the database
with new data (having the IDs of the existing ob-
jects). Using the "replace" feature we are able to
change the location of the document tree without
having to change the IDs for the documents. The
other parameters to the utility specify the Unix file
system location from which to read the interchange
format, the name of the database to add the data to,
and the new location for the document tree.
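
The difference between the two ID modes can be illustrated with a small sketch. It assumes a database modeled as a Python dictionary keyed by object ID; the function name import_objects, the create_new_ids parameter, and the "path" field are hypothetical and do not reflect the link server's actual interface.

```python
def import_objects(database, incoming, create_new_ids):
    """Add interchange records to a database keyed by object ID.

    database: dict mapping ID -> record (a stand-in for the link server).
    incoming: list of records, each carrying the "id" it had when exported.
    create_new_ids: True appends under fresh IDs; False reuses the IDs and
    so overwrites any existing record with the same ID.
    Returns a mapping from each incoming ID to the ID actually stored.
    """
    next_id = max(database, default=0) + 1
    id_map = {}
    for record in incoming:
        if create_new_ids:
            new_id = next_id
            next_id += 1
        else:
            new_id = record["id"]
        database[new_id] = {**record, "id": new_id}
        id_map[record["id"]] = new_id
    return id_map


db = {1: {"id": 1, "path": "/old/tree/moon"}}
moved = [{"id": 1, "path": "/new/tree/moon"}]

# "Replace" mode: same ID, new document-tree location.
replace_map = import_objects(db, moved, create_new_ids=False)

# "Append" mode: the same record is stored again under a fresh ID.
append_map = import_objects(db, moved, create_new_ids=True)
```

When new IDs are created, any references between imported objects (for example, from anchors to documents) would have to be rewritten through the returned mapping; the sketch leaves that step out.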

The Export utility calls the export functions of the
link server to dump all data from the database into
the interchange format. The Verify utility can be
run in conjunction with Export, to ensure data in-
tegrity. The parameters of Export are the same as
those of Import that deal with Unix file specifica-
tions, except that Export writes where Import reads,
and vice versa.



5.3. Future Developments for Utilities 

The utilities described here have been integrated 
into an application that will potentially be in a 
publicly available version of Intermedia. This ap- 
plication, called Transfer, enables users to select 
document, anchor, and link information to be ex- 
ported by selecting folders and their contents (i.e., 
documents and webs). In order to maintain the in- 
tegrity of all the webs in the selection, documents 
that lie outside the selection in the folder hierar- 
chy but have links or anchors in a selected web, are 
also exported. When exporting, the user can select 
the type of media to export the data to. Hard disk, 
floppy diskette, and tape are currently supported. 
Users can also import previously exported data, 
from the same media types. 
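
The rule that keeps selected webs intact can be sketched as a simple closure over the selection. The data shapes (a map from web to anchor IDs and a map from anchor to document) and all names below are assumptions for illustration only, not the Transfer application's data structures.

```python
def documents_to_export(selected_docs, selected_webs, web_anchors, anchor_doc):
    """Return the selected documents plus any others that a selected web touches.

    web_anchors: web name -> set of anchor IDs used by links in that web.
    anchor_doc: anchor ID -> document containing the anchor.
    """
    export = set(selected_docs)
    for web in selected_webs:
        for anchor in web_anchors.get(web, ()):
            # A document outside the selection is pulled in if a selected
            # web has an anchor in it.
            export.add(anchor_doc[anchor])
    return export


web_anchors = {"moon-web": {"a1", "a2"}, "tutorial-web": {"a3"}}
anchor_doc = {"a1": "craters.txt", "a2": "apollo.pict", "a3": "intro.txt"}
result = documents_to_export({"craters.txt"}, {"moon-web"}, web_anchors, anchor_doc)
```

Here apollo.pict is exported even though it was not selected, because the selected web has an anchor in it; intro.txt belongs only to an unselected web and is left out.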

At present, the Transfer application generates data 
in a form of the interchange format described here. 
It is intended that the application be able to gener-
ate any of a number of other formats as their defini-
tions become available and they come into use.

There are also plans to create other utilities to en-
able the conversion of first-order interchange for-
mats into second-order interchange formats, or of
prototype first-order interchange formats into pro-
duction first-order interchange formats, as the
need arises.

6. OTHER INTERCHANGE FORMATS 

At the time I developed the interchange format de-
scribed here, I knew of no other hypertext inter-
change formats under development. Many design
elements in this interchange format apply specifi-
cally to the requirements of the Intermedia system.
However, the major conceptual elements are common
to most other hypertext interchange formats.

In this described interchange format, the structure of
the data file is static, while the data that fills
that structure changes dynamically. A format like
this is very simple to implement. However, when
interchanging with other disparate systems this in-
terchange format becomes very difficult to use.
Converting its structure to a tagged format, like
SGML, would make it more portable.
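
The contrast between a static, positional layout and a tagged one can be shown with a short sketch. The field order in LINK_FIELDS and the element name are hypothetical; they stand in for the actual record layout, which is defined elsewhere in this paper and is not reproduced here.

```python
from xml.sax.saxutils import quoteattr

# Hypothetical field order for a link record, assumed for illustration.
LINK_FIELDS = ("id", "source", "destination", "linkType")


def positional_to_tagged(line, element="link", fields=LINK_FIELDS):
    """Turn one whitespace-separated positional record into a tagged element."""
    values = line.split()
    if len(values) != len(fields):
        raise ValueError(f"expected {len(fields)} fields, got {len(values)}")
    # Each value now carries its field name, so a reader no longer needs
    # to know the field order in advance.
    attrs = " ".join(
        f"{name}={quoteattr(value)}" for name, value in zip(fields, values)
    )
    return f"<{element} {attrs}>"


tagged = positional_to_tagged("17 3 42 reference")
```

A receiving system that does not know the positional layout can still pick out the fields it understands from the tagged form and skip the rest, which is the portability gain the text refers to.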

It would be possible to convert this format to the
X3V1.8M interchange format [Gold89] with rela-
tively few or no extensions to the HyTime DTD.
However, there are several drawbacks in doing this.
First, none of the documents in Intermedia are stored
in SGML format, so references to components of the
documents may be difficult. Second, the link-and-an-
chor database is separate from the document
database, in order to support linking to non-writable
media (like CD-ROM disks) and to support multiple
web mappings over the same document sets.

The task of converting this data structure to support 
any of the interchange formats [Born89] that conform
XP'^'''^'' ii^^\m would be possible as 

well. This would require adding tags and attributes
to the existing data elements with some minor re-
organizations. This is planned as a future project.

7. SUMMARY

In this paper a format is documented that shows the
structure of the data files and the minimum infor-
mation necessary to transfer hypertext information
from one first-order hypertext system to another.
These data files, when combined with a methodol-
ogy for converting and transferring the contents of
application document files, embody an interchange
format enabling the full exchange of information be-
tween existing hypertext systems. This was demon-
strated by the use of the interchange format to
transfer data into and out of Intermedia.

It is hoped that this format could be a base of ideas
in developing an interchange standard for first-order
hypertext systems, thus enabling the sharing of hy-
pertext information more freely.

The need remains to establish and publish conven-
tions for assigning values in the SiteID, docType,
linkType, attDepype, attValType, and objectType
fields to ensure compatibility between the systems
on both ends of a data exchange.
Abstract

This paper provides a strawman reference model that can be used for comparing and rea-
soning about hypertext/hypermedia systems. It begins with a glossary of hypermedia terms.
Agreeing on these provides a common vocabulary for developing the reference model. The ref-
erence model itself is presented in terms of basic features all hypermedia systems have, advanced
features some hypermedia systems have, and open features that hypermedia systems share with
other computer systems. These features represent independent dimensions which can be used
to classify or compare existing hypermedia systems and to contrast them with near-miss related
systems. Based on the features, the architecture of an ideal hypermedia system is described 
that covers existing hypermedia systems. The architecture is modular. A consequence is that 
discussion of standards or a more detailed reference model can focus on one module at a time, 
avoiding movement toward a portmanteau standard. The final section of the paper evaluates 
some areas where consensus and eventual standardization of hypermedia systems is possible 
and would be valuable. An appendix references some standards related to hypermedia sys- 
tems. Another appendix is an initial document log listing references important to hypermedia 
standardization. 

1 Introduction 

The premise of the Hypertext Standardization Workshop is that "hypertext and hypermedia tech-
nologies have reached the point where it makes sense to consider their potential for formal stan-
dardization" [Workshop Call for Papers].

This paper provides a strawman reference model that can be used for comparing and reasoning 
about hypertext/hypermedia systems and suggests some areas where enough consensus could occur 
to make eventual standardization possible.

Section 2 provides an (incomplete) glossary of hypermedia terms. A standard glossary would 
provide a common vocabulary for implementors and users of hypermedia systems. This level of
standard promotes communication among people. 

Section 3 presents a strawman hypermedia reference model. Standardizing on a reference model
should make it possible for people to compare different hypermedia systems and other closely related
systems. The section demonstrates this by using the dimensions of a hypermedia system described
in the reference model to compare several hypermedia systems. The section concludes with an 
ideal, modular architecture for a hypermedia system. 

Operational standards should make it possible for computer systems to share data or interface to 
each other. Section 4 evaluates potential areas, indexed to the reference model, where operational
standards for hypermedia systems may be possible and would be valuable.

Appendix A references some existing standards related to hypermedia systems. Appendix B is
a place holder for the document log that a hypermedia systems study group would maintain. 

In fact, overall, this paper can be viewed as the skeleton for a Final Report of a study group
yet to be formed recommending whether and what hypermedia standardization is useful. Such a
report might lead to the formation of an official standards body charged with formulating detailed
hypermedia standards.

2 Glossary 

The purpose of the glossary is to register terms and how they are used in different hypertext
systems. The value of a glossary in standardization is to provide a common vocabulary so we all
understand common terms the same way and can distinguish their various overloaded meanings.
In addition, glossary terms are important in the development of a reference model (section 3) and
provide a simple approximate way to scope a domain. Here we only list some of the more prominent
terms that need to be defined.



hypertext 
hypermedia 
browser 
editor 

hypermedia abstract machine 

unique id 

node 

cut-and-paste 
link 

warm link 
hot link 
field 
button 
anchor 

link service 

link protocol 

content 

annotation 

version 

configuration 

web 

network 

guideline 

stack 
card 

background card 

field 

locktext 

script 

scroll 

bookmark 

history 

map 

open architecture



Here we only comment that some terms like link are heavily overloaded. Other terms like node,
card, frame are system-specific names for approximately the same concept.

3 Reference Model 

A hypermedia reference model is an English description of characteristics that "cover" existing (and
future) hypermedia systems and provide people with a way to compare them. 

Subsections 3.1, 3.2, and 3.3 sketch basic, advanced, and open features of a prototypical hy- 
permedia system. Each feature represents an independent dimension in which hypermedia systems 
vary. Subsection 3.4 compares how some existing hypermedia systems fit this model and how some 
near-miss systems compare. Subsections 3.5 and 3.6 describe an "ideal" architecture for a hyper-
media system based on the premise that orthogonality implies modularity. If this premise is correct,
we should expect to concentrate standardization efforts on modules, not on whole systems.

3.1 Basic Features 

All hypermedia systems have the following basic characteristics or dimensions through which they
vary and can be compared. 
The representation dimension provides the primitive media types or content part, and the com-
positional data model or structural part, that together are used to represent information in a
hypermedia system. It is convenient to distinguish these two sorts of representations as separate 
dimensions. 

Media Types. A hypertext system must be able to represent text (as well as structure). A
hypermedia system adds other media types (bitmaps, graphics, sound, video). Specialized media 
editors are needed to permit WYSIWYG editing of media types. Compression of media types 
may be supported; automatic conversion between some media types is supported (e.g. graphic- 
to-bitmap). (Various) standards already exist for representing many of these media types (see 
Appendix A). 

Data Model. A data model provides the structuring primitives^1 of the hypermedia system. To-
gether, the data model and media data types are used to represent or encode the application-specific
information content in a hypermedia information system. Specialized hypermedia interpreters, usu-
ally with built-in operations, operate on the basic data structures of the data model.

Data modeling is the most interesting and diverse dimension of hypermedia systems. The com-
mon invariant that all hypermedia systems share is the notion of navigating through an information
space by following links. Beyond that, systems vary widely, most implementing some sort of se-
mantic net with more or less structure. Many hypermedia glossary terms describe system-specific
data model concepts (e.g., stack, card, history). Nodes may be inherently unstructured; they may
have built-in or user programmable types; or they may have attributes, fields, or buttons. Links
also vary. Most are binary; they may be typed and have attributes; they may anchor at nodes or
within nodes in a media- or type-specific or application-specific way; or they may be built from
lower level primitives (anchors and go to's as in HyperCard).
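
One way to make these variations concrete is a small sketch of a data model with typed nodes and binary, typed links that anchor either at a whole node or within it. The class and field names below are assumptions chosen for illustration; they do not describe any particular system named in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    content: str = ""
    node_type: str = "untyped"      # built-in or user-defined type
    attributes: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Anchor:
    node_id: str
    span: tuple = None  # None anchors at the whole node; else (start, end)


@dataclass
class Link:
    source: Anchor
    destination: Anchor
    link_type: str = "untyped"
    attributes: dict = field(default_factory=dict)


def follow(links, node_id):
    """Return the IDs of nodes reachable in one step from node_id."""
    return [l.destination.node_id for l in links if l.source.node_id == node_id]


nodes = {
    n.node_id: n
    for n in (Node("moon", "The Moon ..."), Node("apollo", "Apollo 11 ..."))
}
links = [Link(Anchor("moon", (4, 8)), Anchor("apollo"), link_type="reference")]
reachable = follow(links, "moon")
```

The follow function captures the common invariant, navigation by following links; everything else in the sketch (types, attributes, spans) is a dimension along which systems differ.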

While data models differ across different hypermedia systems, they are nearly always built-in to 
today's systems. Later, in section 4 we will consider when and whether mappings between different
data models are possible. 

User Interface. The user interface provides the capability of viewing and editing (WYSI-
WYG) presentations of information represented by the data model and media types.

^1 The "hyper" in hypermedia.

Some hypermedia systems like KMS and HyperCard use the metaphor of a "notecard" and only
provide fixed (screen-sized) cards with only one card visible at a time. Others like NoteCards use an 
overlapping or tiled window system metaphor of flexible-sized cards with the content and structure 
of a card still tied one-to-one to the display window. Guide provides scrolling and progressive 
disclosure, a step towards providing the user with control of which objects he can see on a screen. 
More generally, a many-many view mapping like that in CMU Andrew covers all of the above cases.

Persistence. Hypermedia systems all provide some notion of transferring application-specific 
content and structure to and from some persistent storage repository. They vary on the unit of
transfer (e.g. Guide document, HyperCard card, NoteCards application) and the file or database
format they use to encode the data represented by the data model. 

3.2 Advanced Features

Not all hypermedia systems have the following advanced characteristics. While not mandatory 
(essential, intrinsic, defining), they complement the basic features and are needed for non-trivial 
hypermedia systems. 

Multi-user. Computer-supported cooperative work requires many users to access shared data. 
Some hypermedia systems support this. Sharing by multiple users adds the need for some concur-
rency control scheme like locking or time-stamping so users can coordinate access to shared data.
Data and/or structure may be read-only or modifiable according to the access rights of users. Users
can be granted different access rights at different times or for different purposes. 
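
As an illustration of the time-stamping alternative, the following sketch rejects a write when the node has changed since the writer last read it. The class SharedStore, its methods, and the use of a logical counter in place of a real clock are all assumptions made for the example; locking would be a different, equally valid scheme.

```python
class StaleWrite(Exception):
    """Raised when a node changed after the writer last read it."""


class SharedStore:
    def __init__(self):
        self._data = {}      # node ID -> content
        self._stamp = {}     # node ID -> logical time of last write
        self._clock = 0

    def read(self, node_id):
        """Return (content, stamp); the stamp must accompany a later write."""
        return self._data.get(node_id), self._stamp.get(node_id, 0)

    def write(self, node_id, content, stamp_seen):
        """Accept the write only if nobody wrote since stamp_seen."""
        if self._stamp.get(node_id, 0) != stamp_seen:
            raise StaleWrite(node_id)
        self._clock += 1
        self._data[node_id] = content
        self._stamp[node_id] = self._clock


store = SharedStore()
_, s0 = store.read("n1")
store.write("n1", "first draft", s0)

# Two users read the same version; only the first write succeeds.
_, seen_a = store.read("n1")
_, seen_b = store.read("n1")
store.write("n1", "edit by A", seen_a)
try:
    store.write("n1", "edit by B", seen_b)
    conflict = False
except StaleWrite:
    conflict = True
```

The second writer learns of the conflict and can re-read the node before trying again, which is one way for users to coordinate access without holding locks.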

Distributed. Even for a single user, hypermedia data may be stored in a central repository or 
be distributed. For instance, content may be on a WORM device and structure may be stored in 
a relational database. 

Uniform Representation.^2 Many hypermedia systems make a distinction between node and
contents. This forces the user to "chunk" the information he wants to represent into some fixed
grain-size. This can lead to users spending time manually restructuring information. Advanced
systems provide a more recursive formulation of the data model allowing content to contain nodes
(further structured information). This extra information plus a richer mapping of the more uniform
data model to the user interface can give the user many views of the same information. Systems
like Guide begin to take advantage of this by allowing the user to control which objects are visible
using progressive disclosure. Intermedia webs allow two or more views to "share" common nodes.
Systems like Lotus Agenda allow the user to reorganize the information based on a simple form of
computed view. The semantics of sharing common objects from different perspectives can lead to
dangling pointers and view update problems.

^2 This feature is not independent of the data modeling feature presented earlier but is included here as a major
dimension for comparing advanced hypermedia systems.

A different aspect of uniform representation involves the ability to deal with foreign nodes. 
These are nodes whose contents are opaque to the hypermedia system. For at least two reasons, 
uniform representations must generalize to account for these foreign representations. First, not all 
workstations can display all information, so video or even graphic information will remain opaque
on these workstations. Second, hypereditors like KMS or Neptune can bind to non-hypereditors, 
like word processors, that do not understand link protocols (are not themselves uniform; do not 
represent their internal information in a way the hypermedia system can interpret). In this case, 
links typically anchor to whole nodes, which act to "wrap" the foreign editor, or else link anchors 
consist of two parts, a node id and a specifier, often written in a script language that can be 
interpreted by a foreign tool, telling how to offset into the foreign representation. Sun provides an 
application-independent Link Service protocol for standardizing cross-application linking as does 
HP New Wave. 
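
The two-part anchor described above can be sketched as follows. The hypermedia system stores the specifier without interpreting it and hands it to the foreign tool that owns the node. The "start:end" specifier syntax, the word_processor_resolve function, and the RESOLVERS table are hypothetical; they are not drawn from the Sun or HP protocols mentioned in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ForeignAnchor:
    node_id: str                     # which foreign document
    specifier: Optional[str] = None  # opaque to the hypermedia system


def word_processor_resolve(content, specifier):
    """A made-up foreign tool that reads specifiers of the form 'start:end'."""
    start, end = (int(x) for x in specifier.split(":"))
    return content[start:end]


# Each foreign tool knows how to interpret its own specifiers.
RESOLVERS = {"wp": word_processor_resolve}


def resolve(anchor, nodes):
    """Return the anchored material; the whole node when there is no specifier."""
    tool, content = nodes[anchor.node_id]
    if anchor.specifier is None:
        return content
    return RESOLVERS[tool](content, anchor.specifier)


nodes = {"memo": ("wp", "Hypertext standards workshop")}
whole = resolve(ForeignAnchor("memo"), nodes)
part = resolve(ForeignAnchor("memo", "0:9"), nodes)
```

An anchor without a specifier corresponds to linking to the whole node that wraps the foreign editor; an anchor with one delegates the offset computation to the foreign tool.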

One last variation in representation is whether hypermedia systems permit users to define the 
scope of objects like figure, section, document, library, video clip, or whether these types are built- 
in. 

Computational Completeness. The computational completeness dimension describes how 
procedural information can be associated with the hypermedia data model to model behavioral 
aspects of the information. 

Procedures can be coupled with data in many ways. Most characteristically, an anchor contains
a script (procedure in a language specialized to the data model as in HyperCard) that is triggered if
the anchor is activated. Alternatives are demons and rules as in Object Lens, procedures in general
purpose languages as in NoteCards, assertions, and so on. Since procedural attachments are added
dynamically, there must be an interpreter or dynamic compiler.
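
A minimal sketch of a script attached to an anchor, together with the interpreter it requires, might look like this. The two-command language ("go" and "say") is invented for the example and is not HyperTalk or any other real scripting language.

```python
class Session:
    """Holds the state a script may change: current node and a message log."""

    def __init__(self, start):
        self.current = start
        self.log = []


def run_script(script, session):
    """Interpret a tiny, made-up command language, one command per line."""
    for line in script.splitlines():
        words = line.split()
        if not words:
            continue
        command, args = words[0], words[1:]
        if command == "go":
            session.current = args[0]
        elif command == "say":
            session.log.append(" ".join(args))
        else:
            raise ValueError(f"unknown command: {command}")


# Scripts are plain data attached to anchors, so they can be added at run time.
anchor_scripts = {"button-1": "say Leaving the overview\ngo apollo"}


def activate(anchor_id, session):
    """Trigger the script attached to an anchor."""
    run_script(anchor_scripts[anchor_id], session)


session = Session("overview")
activate("button-1", session)
```

Because the scripts are stored as data and interpreted on activation, new behavior can be attached without recompiling the system, which is the reason the text gives for needing an interpreter or dynamic compiler.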

This dimension is the hardest to transport across systems, as we discuss in section 4. 

Query. Hypermedia information spaces are often large. Navigation is used for browsing;
bookmarks for going to known places. Search is used for locating items of interest by their charac-
teristics. Some dimensions of search include limiting the scope or order of a search; string search
versus indexing text; boolean search predicates and their possible use of indices; user-defined search
predicates; incremental search; and how the end user can easily specify complex searches.

Another dimension involves what to do when search is successful. Alternatives are that the 
search results in a computed path through the information space or in a new view of the information
space. Much work from the database and information retrieval areas is useful here. Query is a very
rich dimension. 
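
Several of these dimensions (scope limiting, user-defined predicates, and string search versus indexed boolean search) can be illustrated in a few lines. The function names and the word-level inverted index are assumptions for the sketch; the result of each search is a list of node IDs that could serve as a new view.

```python
from collections import defaultdict


def build_index(nodes):
    """Map each lowercase word to the set of node IDs containing it."""
    index = defaultdict(set)
    for node_id, text in nodes.items():
        for word in text.lower().split():
            index[word].add(node_id)
    return index


def search(nodes, predicate, scope=None):
    """Scan nodes (optionally only those in scope) and keep the matches."""
    candidates = nodes if scope is None else {k: nodes[k] for k in scope}
    return sorted(k for k, text in candidates.items() if predicate(text))


def indexed_and(index, *words):
    """Boolean AND over index entries, avoiding a scan of the text."""
    sets = [index.get(w.lower(), set()) for w in words]
    return sorted(set.intersection(*sets)) if sets else []


nodes = {
    "n1": "lunar craters and maria",
    "n2": "apollo landing sites near craters",
    "n3": "tutorial on linking",
}
index = build_index(nodes)
scan_hits = search(nodes, lambda t: "craters" in t, scope={"n2", "n3"})
index_hits = indexed_and(index, "craters", "apollo")
```

The scan accepts any user-defined predicate but touches every node in scope; the indexed form is restricted to word lookups but does not read the text at all.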

General-purpose procedural attachments generalize query capabilities and many hypermedia 
systems contain weak or no specialized query facilities. This leaves the burden of specifying complex 
queries to the user via programming. 

Versions, Configurations. Especially for design applications (e.g., documents, software), 
where the life cycle of a design needs to be represented, a Change Management data model consisting
of versions, configurations and transformations is useful for recording change, how a complex object 
is composed of its parts, and how change propagates. 

Portmanteau Features. Subsection 3.4.2 describes near-miss systems closely related to hy- 
permedia systems. We can mine these systems for other characteristics that hypermedia systems 
could have.^3 This could overload hypermedia systems with more than their ordinary meaning but
the exercise is needed to determine how these systems differ from hypermedia systems.

3.3 Open Features 

Open features are generic and belong to many or all computer systems. They may apply in special
ways to hypermedia systems.

^3 For example, few if any hypermedia systems provide parsers to automatically recognize structure in unstructured
information. This is clearly important since a whole hypermedia business could be built around structuring the mass
of unstructured information. Most parser technology is aimed at recognizing already designed languages. The Oxford
English Dictionary project at University of Waterloo is one place to look for good ideas on the interplay between
parsing, querying, and computed data models induced by a grammar.

Human Factors. This dimension measures how likeable, usable, and effective a system is for
the tasks it is designed or needed for.

Open versus Closed Architectures. Hypermedia systems vary along the dimension of how 
closed or open they are, that is, how extensible they are. Some aspects of openness are: 

• none - browsers 

• editing only - simple authoring systems like Guide 

• user can add node and link types; or can specialize classes the system defines. 

• user can provide procedural attachments 

• system has an application program interface

• system is modular and modules can be replaced 

Monolithic versus Modular Architecture. Today's hypermedia systems are monolithic. An
alternative is a modular, toolkit architecture in which modules can be added or replaced as the need
arises. This would mean that design applications could make use of the change management module
but other applications would not have to pay this cost. If some specialized change management
is needed, only that module is replaced. The modules themselves may be open, e.g., the query
optimizer could be programmable; the version scheme's notion of deltas could be too; pragmas
might control how objects are clustered on disk; new kinds of presentations could be added to the
user interface. A key issue related to modularity involves determining the protocols an existing
foreign editor must implement to become a friendly hypereditor. It is more likely that "the world's
best editors" can be modified to be hypermedia-conformant than that hypermedia editors will come
to rival these editors.

Portability and Industrial Strength. The portability dimension describes how a system is
bound to its environment and how easily it can be moved to other environments. It will be more
portable if 1) it is implemented on de facto standard, industrial strength platforms (Unix, DOS,
X Windows, C++, SQL, etc.), 2) it contains alternative, equivalent implementations for different
environments (Open Look versus Presentation Manager), and 3) it can exchange data with many
existing, popular data exchange formats.

A hypermedia system is industrial strength if 1) it is debugged and maintained, 2) it scales 
up for large hypermedia bases, 3) performance is acceptable, 4) it has (online) documentation and 
tutorials, 5) it is portable, and/or 6) it is being used in practice. 

Cost, Availability, Service. The world's best designed hypermedia system is worth less if it 
is too costly, unavailable, breaks, and so on. This dimension is a non-technical road block to many 
systems. 

Packaging. This characteristic represents the particular binding of all previous characteristics 
that defines any given system. It is measured by some sort of success metric. 

3.4 Comparison of Existing Systems 

If the reference model just defined is successful, we should be able to compare existing and related 
systems using the dimensions it defines. 

3.4.1 Comparison with Other Hypermedia Systems 
Table 1 makes this comparison for existing hypermedia systems.

Table 1: Comparison of Hypermedia Systems by Basic/Advanced Feature

3.4.2 Relationship of Hypermedia Systems to Near-Miss Systems

A hypermedia reference model must also allow comparison with similar systems that are not usually 
classified as hypermedia systems. The big question is, if we factor these systems into their
characteristic dimensions, then how much overlap would there be between systems?^4

Programming language data structures including object-oriented programming, and AI knowledge
representations including frame-based systems, carry data modeling much further than hypermedia
systems do today. They provide better uniform representations but have no particular support
for foreign objects. In particular, object-oriented programming languages like C++, CLOS, and
Smalltalk have common characteristics including object identity, encapsulation, types or classes, 
and (multiple) inheritance; and they provide procedural attachment. These systems make a strong 
type-instance distinction and some only allow creation of new data types at compile time. 

Persistent programming languages make the data model of the programming language incremen- 
tally persistent, managing secondary storage, concurrency, and recovery. Object-oriented databases 
add sets, queries, and indexing; and also change management and schema evolution to persistent 
languages, but take no particular stand on user interfaces. As such, they generalize relational
database systems, though implementations of the latter are far more mature. Even more special-
ized are implementations of information retrieval systems which store large text bases persistently,
support indexing, but typically provide no editing, data modeling, and only specialized query lan-
guages. Geographic information systems store graphical information in often-specialized databases.

User interface management systems allow simple user interfaces to be built quickly. User inter- 
face toolkits like Stanford Interviews and CMU Andrew provide general purpose interface building 
kits but require programming to put the pieces together. They do not commit to any particular data 
model. In general, object libraries are a way to package up collections of related objects for reuse in
building large systems. Structured graphics editors can make use of such systems to build generic
shapes. Programming language inspectors and class browsers can be viewed as specialized hyper-
media systems for viewing rich representations. DIRED editors, e-mail previewers, CAD schematic
editors, CASE interfaces, and other semantically specialized graphics editors can browse and edit
many views of domain-specific structured data types. Personal Information Systems like Symantec
GrandView and Lotus Agenda provide many views, including hierarchical views, of simple records
via cross indexing.

^4 A related implementation question is, are we building almost the same systems over and over without factoring
out the common modules?

The kind of objects represented by these systems are usually but not necessarily fine-grained. 
Computer-aided publishing (CAP) systems add primitive objects like text rectangles that may be 
large and may contain embedded objects. Text and document markup languages represent the 
content of very rich hypertext-like systems often specialized to document preparation but also used
as the external representations of WYSIWYG document preparation systems like FrameMaker.
Syntax-directed structure editors parse structured text and permit editing, pretty printing, and
controlled viewing of programs. 

Finally, where Office Document Architecture only distinguishes a structural and a page layout
architecture for text, graphics, and other static media, technologies like Digital Video Interactive 
specify how to temporally sequence video and sound and introduce compression. 

All of these systems are almost hypermedia systems. Some introduce new features including 
richer data modeling and compression; others seem more like elements of a hypermedia toolkit since 
they overlap hypermedia systems concentrating only on one basic or advanced feature or another. 

3.5 Architecture of an Ideal Hypermedia System

Figure 1 represents an ideal hypermedia system that covers all of the basic and advanced features
described earlier in this section. The key point of the architecture is that it is modular and
open. This modularity is based on the observations that the functions the modules perform are
independent of each other, that is, orthogonality implies modularity. The only required modules for
a basic hypermedia system are the User Interface Toolkit, Domain-specific Data Modeling, Type 
and Object Manager, and Persistent Storage modules. 
Module independence is justified as follows: 

• Media types provide primitive representations for text, bitmaps, audio, video, graphics. 

• The data model represents structure (nodes, relationships, and content) uniformly. It defines
what the hypermedia system can represent. Specializations can define hypermedia objects
like card or field or they can define domain-specific objects like transistor or module. 
Figure 1: Proposed Ideal Hypermedia System Architecture.

• Structure and content can be displayed in many ways (or not at all) so the presentation
layer is independent. This can be implemented with a data model-independent user interface
toolkit.

• Whether and how this information is mapped to permanent storage is again independent of
what is represented, so the storage system is orthogonal. Implement this with a persistent
programming language.^5

• Queries and indexing are related only to whether there are sets, collections, or other navigation 
paths to iterate through and whether there is cached information (indexes) that can be used to
limit the search. Implement this with the open query module of an object-oriented database. 

• Systems may or may not version their structure and content. How they do this, if they do, can
be studied independently of what they represent, how it is viewed, or whether it is persistent.
Implement this with a separate Change Management module.

^5 View-independence and storage-independence from representation are similar to the famous 3-tier model of
databases.

• From a single user's point of view, whether the system is multi-user or not is largely trans-
parent; the same goes for whether the system is distributed.

• Implement the above functions modularly with well-defined interfaces specified between mod-
ules.
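
The idea of independent modules joined by well-defined interfaces can be sketched as below. The interface names (Storage, ChangeManager) and the classes that implement them are hypothetical; the sketch shows only that an optional module such as change management can be added or omitted without touching the rest of the system.

```python
from typing import Optional, Protocol


class Storage(Protocol):
    def save(self, key: str, value: str) -> None: ...
    def load(self, key: str) -> str: ...


class ChangeManager(Protocol):
    def record(self, key: str, value: str) -> None: ...


class MemoryStorage:
    """A required module: the simplest possible persistent store stand-in."""

    def __init__(self):
        self._items = {}

    def save(self, key, value):
        self._items[key] = value

    def load(self, key):
        return self._items[key]


class HistoryLog:
    """An optional module: keeps every value ever written under a key."""

    def __init__(self):
        self.versions = {}

    def record(self, key, value):
        self.versions.setdefault(key, []).append(value)


class HypermediaSystem:
    """Depends only on the interfaces, so either module can be swapped."""

    def __init__(self, storage: Storage, changes: Optional[ChangeManager] = None):
        self.storage = storage
        self.changes = changes

    def put(self, key, value):
        if self.changes is not None:
            self.changes.record(key, value)
        self.storage.save(key, value)


basic = HypermediaSystem(MemoryStorage())
basic.put("n1", "draft")

history = HistoryLog()
versioned = HypermediaSystem(MemoryStorage(), history)
versioned.put("n1", "draft")
versioned.put("n1", "final")
```

The basic system pays nothing for versioning, while the versioned one gains a history simply by being constructed with the extra module.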



3.6 Advantages of this Architecture

A modular, toolkit architecture like the one described in the previous section has these ad- 
vantages: 

- The architecture could be used to build existing hypermedia systems. In that sense, it
covers and explains these systems. 

- Related systems are implementing several of the modules needed in an ideal hypermedia
system. Work on class libraries, persistent languages and OODBs, and user interface 
toolkits is proceeding in parallel with work on hypermedia systems. 

- Since the architecture is modular, modules can be improved individually which would 
incrementally improve the :>ystem. They can be improved by different research groups or 
vendors. People need not build whole hypermedia systems to experiment with particular 
parts. 

- It will be easier to build the near-miss systems using a modular hypermedia toolkit and 
the extra capabilities they add to the toolkit will likely benefit existing applications. 

- Customized systems that use only the modules they need can be constructed.

- If the modules are orthogonal, then consensus that leads to standardization should con- 
centrate on individual abstractions, not portmanteau standards covering many essen- 
tially independent parts. 

The architecture proposed here is similar to the proposed architecture for Application
Integration Frameworks being developed by several industrial consortia. These include: USAF
WRDC Engineering Information Systems (EIS), Object Management Group (OMG), CAD
Frameworks Initiative (CFI), and CASE Integrated Systems (CIS). As shown in Figure 2, all 
these efforts provide an object-oriented software backplane architecture into which software 
services are "plugged." This allows new applications that use the common services of the 
framework to be built more quickly and to have a "uniform semantics." Applications are 
simpler to implement since common services are factored out and provided by the framework. 
To date, framework services include common link protocols like Sun's Link Protocol, help 
and tutorial services, debugging services, and change management services, all implemented 
on top of file systems. 

Figure 2: Hypermedia Modules complement Application Integration Frameworks.

Missing ingredients from these framework architectures are the modules offered by a modular
OODB, which would permit sharing at the object grain size instead of the file grain size, and
querying. Also missing are user interface toolkits and data modeling facilities needed by a
hypermedia system. The framework view of the world as modular services fits very well with
the proposed modular hypermedia system architecture.

3.7 Conclusion 

The reference model presented in this section is incomplete. More work is needed to refine 
it in many places. Nevertheless, we have shown how it provides a way to compare existing 
hypermedia systems along orthogonal dimensions and have indicated that it can be extended 
to relate hypermedia systems to several kinds of near-miss systems. Based on the features 
of hypermedia systems, an ideal architecture for a hypermedia system was presented and 
advantages of this architecture were described and related to the architecture of Applica- 
tion Integration Frameworks. An argument was given for how a modular architecture can 
accelerate progress towards hypermedia standardization.

4 Operational Standards: Where is Consensus Possible? 

Operational standards provide means for different computer systems to agree to communicate 
or interface or share. Many sorts are possible in the hypermedia area, reflecting the indepen- 
dent dimensions of the reference model presented earlier. This section identifies some areas 
where hypermedia standardization might succeed and be useful. 

4.1 Common Media Type Representations. 

Standards already exist in this area. Appendix A lists some of these. Different media have
different properties (linear or 2-D in time or space, discrete or continuous, etc.). Conversions
among some media representations are algorithmic but lose information (e.g., structured
graphics to bitmap, high resolution to low resolution). Often, higher-level structured (or
other media) representations are encoded in media representations. In some cases we
know how to parse the media algorithmically to recognize this information; often we do not.

4.2 Common Hypermedia Abstract Machine and Interchange Format.

The data modeling module of a hypermedia system (including the media types) can be
represented equivalently as 1) an abstract machine which includes a specification of operations
on data (an interpreter) plus an internal representation of the data it can operate on or 2)
an external format that encodes the application-specific content of the system for storage or
transmission.

The Neptune Hypermedia Abstract Machine (HAM) [7] describes a semantic-net abstract 
machine that includes not only data modeling primitives but also operations for managing 
change and querying. Representation primitives are nodes, attributes, and values. 

By itself, a semantic net data model is so weak that it permits any structural information 
to be encoded. As such, it represents very little unless an interpreter looks at the data (at 
attribute names like type). An ASCII linear representation of a semantic net would have the 
same semantic-less information-bearing properties. 
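
A small sketch may make this point concrete. It is not the Neptune HAM interface; the
class SemanticNet, the function interpret, and the attribute names used are assumptions
chosen for illustration. The store accepts any node, attribute, and value, and meaning
appears only when an interpreter inspects an attribute such as type.

```python
class SemanticNet:
    """Stores node -> {attribute: value} and imposes no schema at all."""

    def __init__(self):
        self._attrs = {}

    def set(self, node, attribute, value):
        self._attrs.setdefault(node, {})[attribute] = value

    def get(self, node, attribute, default=None):
        return self._attrs.get(node, {}).get(attribute, default)

    def nodes(self):
        return list(self._attrs)


def interpret(net, node):
    """A hypothetical interpreter: meaning comes from the 'type' attribute."""
    kind = net.get(node, "type")
    if kind == "card":
        return f"card titled {net.get(node, 'title')!r}"
    if kind == "link":
        return f"link from {net.get(node, 'from')} to {net.get(node, 'to')}"
    return "uninterpreted node"


net = SemanticNet()
net.set("n1", "type", "card")
net.set("n1", "title", "Home")
net.set("n2", "type", "link")
net.set("n2", "from", "n1")
net.set("n2", "to", "n3")
net.set("n4", "colour", "blue")  # accepted by the net, meaningless to this interpreter

descriptions = {node: interpret(net, node) for node in net.nodes()}
```

The net stores n4 as readily as n1 and n2; only the interpreter distinguishes them.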

A semantic net representation could be standardized as could an associated linear
representation format. The linear format could use Lisp-like parentheses, SGML-like tags, or an
easy-to-parse, hard-to-understand binary format. But this by itself says nothing about whether
hypermedia systems can exchange hypermedia data or cooperate.
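
As an illustration of a Lisp-like linear format, the sketch below writes a semantic net
(assumed here to be a plain mapping of node to attribute/value pairs) as nested
parentheses and reads it back. The function names and the choice to quote every atom as a
JSON string are assumptions made for this example, not a proposed standard.

```python
import json
import re

# A token is an opening parenthesis, a closing parenthesis, or a quoted string.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"')


def to_sexpr(tree):
    """Write nested lists of strings as a parenthesized expression."""
    if isinstance(tree, str):
        return json.dumps(tree)
    return "(" + " ".join(to_sexpr(item) for item in tree) + ")"


def from_sexpr(text):
    """Read back an expression produced by to_sexpr."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            while tokens[pos] != ")":
                items.append(read())
            pos += 1  # skip the closing parenthesis
            return items
        return json.loads(token)

    return read()


def net_to_tree(net):
    """{node: {attribute: value}} -> [[node, [attribute, value], ...], ...]"""
    return [[node] + [[a, v] for a, v in attrs.items()] for node, attrs in net.items()]


def tree_to_net(tree):
    return {entry[0]: {a: v for a, v in entry[1:]} for entry in tree}


net = {
    "n1": {"type": "card", "title": 'Say "hi" (now)'},
    "n2": {"type": "link", "to": "n1"},
}
text = to_sexpr(net_to_tree(net))
restored = tree_to_net(from_sexpr(text))
```

The round trip preserves every attribute and value, yet a receiving system still learns
nothing about what a "card" or a "link" means, which is the limitation noted above.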

4.3 Common Data Model.

The heart of a hypermedia system is the information it can represent. Distinctions like text
rectangle, frame, card, field, button, breadcrumb, and so on provide this information.
Different hypermedia systems will be able to exchange information only to the extent that there are
mappings between their representation primitives. It may often be reasonable to map a font
from one system to a different font in another (but not always for all purposes). It may even
be reasonable for some purposes to set up mappings from KMS frames to Intermedia nodes
to HyperCard cards to NoteCards. Similarly, Intermedia links can be mapped to HyperCard
fields with simple scripts containing "go to"s. If N hypermedia systems represent the same
object then a mapping to an intermediate form does not lose information and can be useful.
We often need to perform mappings between different systems' representations: if conversion
from one system to another is required, we try to map as much information as is useful.
In most instances, some amount of conversion can happen algorithmically. It is not too
interesting that specific content can be moved between hypermedia systems with
application-specific mappings. The interesting case involves whether application-independent conversion
routines between two systems are useful or possible.

In general, mappings can be one-way (no inverse); they can be non-unique; and they can lose
information. All these cases happen in important hypermedia systems. Because of the power
of scripts, the inverse of mapping Intermedia links to HyperCard fields and go-to scripts is
not unique. HyperCard foreground and background cards can be mapped to KMS frames
but the "inheritance" is lost. Guide's variable-sized text nodes would need to be mapped
to several KMS fixed-size frames. Structured graphics imported into Guide are converted to
bitmaps, losing the structure. And so on.
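
Two of these cases can be sketched with toy data. The sketch does not use the actual
HyperCard, KMS, or Guide formats: cards and frames are modeled as plain dictionaries, and
flatten_card, split_text, and the frame size of 4 characters are hypothetical choices
made for illustration.

```python
def flatten_card(background, foreground):
    """Map a (background, foreground) pair to one flat frame.

    Foreground fields override background fields. The result no longer
    records which fields were inherited from the background.
    """
    frame = dict(background)
    frame.update(foreground)
    return frame


def split_text(text, frame_size):
    """Map one variable-sized text node to several fixed-size frames."""
    return [text[i:i + frame_size] for i in range(0, len(text), frame_size)]


# Two different source structures ...
card_a = ({"logo": "NIST", "title": "untitled"}, {"title": "Agenda"})
card_b = ({"logo": "NIST"}, {"title": "Agenda"})

# ... map to the same frame, so the mapping has no unique inverse.
frame_a = flatten_card(*card_a)
frame_b = flatten_card(*card_b)

# One variable-sized node becomes three frames of at most 4 characters.
frames = split_text("abcdefghij", 4)
```

Because frame_a and frame_b are identical, nothing in the target system says whether the
title was overridden or the background ever had one; the "inheritance" is lost.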

Even when a mapping is established, data exchange between different hypermedia systems
will often not preserve the look and feel of different hypermedia systems. Thus a Guide node
may map to a HyperCard text field but the progressive-disclosure-in-context look and feel of
Guide outline processing will be lost.

With all these caveats, it is often useful to build generic conversion programs. PC and
Macintosh applications commonly convert data to their own internal formats, often losing some
information. References [13-15] describe systems that explore the problems associated with
mapping between different document representations. The Berkeley Vortex system explores
how to maintain an incremental, multiple representation mapping between a WYSIWYG
editor and a markup language representation.

While it is fruitful to try to define intermediate forms like the Dexter Hypermedia Interchange
Format [6] that permit mapping information between today's intermediate forms (since it
points out exactly where the mappings cause problems), it seems unlikely that the behavioral,
script component so dominant in HyperCard can be captured without duplicating the entire
HyperCard script interpreter in some related hypermedia system. It may be better to consider
whether richer, more uniform representations are better than cards and slots.

4.4 Common Object Libraries. 

The X Consortium is considering a standard C++ interface to X Windows. Reference [13] describes a
portable Office Document Architecture toolkit consisting of C subroutines associated with
the CMU Andrew Toolkit. Stanford InterViews is a C++ class library implementing a user
interface toolkit. It seems likely that we could standardize on C++ libraries in these sorts
of areas. Such libraries could implement cards, buttons, and so on but could also uniformly
implement CAD transistors and layout structures.

4.5 Standard OODBs. 

X3/SPARC/DBSSG has recently announced the OODB Task Group which is chartered to 
assess the potential for standardization in the OODB area. This is especially interesting since 
many hypermedia researchers look forward to using OODBs to help implement large, shared 
hypermedia systems. This effort itself may involve several standards: how to seamlessly
interface OODBs to various languages to provide persistence and sharing, and how to map
data between languages (like Sun's XDR) to allow cross-language sharing.

4.6 Abstract Machines for Querying and Change Management 

As mentioned, Neptune HAM defined not only data modeling primitives but also operations
for managing change and querying. These are independent dimensions and should be treated
as separate abstract machines. The query abstract machine should define how a set-oriented query
engine attaches to a representation, indexes it, and permits powerful queries. A change
management abstract machine defines operations on versions, configurations, and transformations.
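
The separation argued for here can be sketched as two small, unrelated pieces of code.
This is a minimal illustration, not the Neptune HAM operations: VersionStore, its
methods, and the select function are hypothetical names. The version store treats object
states as opaque values, and the query function works on any mapping, versioned or not.

```python
import copy


class VersionStore:
    """Change management: keeps every committed state of opaque objects."""

    def __init__(self):
        self._history = {}  # object id -> list of committed states

    def commit(self, obj_id, state):
        versions = self._history.setdefault(obj_id, [])
        versions.append(copy.deepcopy(state))
        return len(versions)  # version numbers start at 1

    def checkout(self, obj_id, version=None):
        versions = self._history[obj_id]
        if version is None:
            version = len(versions)  # default to the latest version
        if not 1 <= version <= len(versions):
            raise KeyError(f"{obj_id} has no version {version}")
        return copy.deepcopy(versions[version - 1])

    def configuration(self, pins):
        """A configuration pins each object id to one version number."""
        return {obj_id: self.checkout(obj_id, v) for obj_id, v in pins.items()}


def select(objects, predicate):
    """Querying: a set-oriented query over any mapping of id -> state."""
    return {obj_id for obj_id, state in objects.items() if predicate(state)}


store = VersionStore()
store.commit("n1", {"text": "draft agenda"})
store.commit("n1", {"text": "final agenda"})
store.commit("n2", {"text": "attendee list"})

old_config = store.configuration({"n1": 1, "n2": 1})
drafts = select(old_config, lambda state: "draft" in state["text"])
```

Neither piece refers to the other's internals: the query runs over a configuration only
because a configuration happens to be a mapping.
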

4.7 Link Protocol

Sun's Link Service and HP New Wave both define a protocol applications can use to set up
various kinds of cross-application links. HP New Wave appears more powerful in that it would
permit cross-application (keyboard) macros based on the link service and implement common
system-wide protocols for accessing help, tutorials, and other common services. This is an area
of potential standardization being covered by the several Frameworks consortia mentioned in
section 3.6.
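
The general idea of a cross-application link protocol can be sketched as follows. This is
not the interface of Sun's Link Service or of HP New Wave; the class LinkService, its
methods, and the application names are invented for illustration. Applications register a
handler for their own anchors, and the service follows a link by calling the handler of
whichever application owns the target.

```python
class LinkService:
    """A hypothetical registry of applications and cross-application links."""

    def __init__(self):
        self._handlers = {}  # application name -> function(anchor)
        self._links = {}     # link id -> (source, target)

    def register(self, app_name, open_anchor):
        """An application announces how to open one of its anchors."""
        self._handlers[app_name] = open_anchor

    def create_link(self, link_id, source, target):
        """source and target are (application name, anchor) pairs."""
        self._links[link_id] = (source, target)

    def follow(self, link_id):
        """Ask the target's application to open the target anchor."""
        _source, (app_name, anchor) = self._links[link_id]
        return self._handlers[app_name](anchor)


service = LinkService()
service.register("editor", lambda anchor: f"editor opened {anchor}")
service.register("drawing", lambda anchor: f"drawing tool opened {anchor}")
service.create_link("L1", ("editor", "report.txt#section-2"), ("drawing", "figure-2"))

result = service.follow("L1")
```

Neither application needs to know the other's data format; each only has to honor the
registration and open-anchor calls of the shared service.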

A Hypermedia Standardization Group would complement the Frameworks effort if it concen- 
trated on making some of the services described above available. 

5 Conclusions 

This paper has provided a reference model for comparing hypermedia systems and an archi- 
tecture that isolates design decisions to modules. The implication is that we can consider 
separable subsystems in isolation, then combine the parts to make a whole hypermedia sys- 
tem. 

Based on this analysis, several areas where consensus is possible were isolated including: 
media representations, data model, interchange formats, class libraries (for media types, data 
modeling types, and domain specific types like CAD), user interface toolkit class libraries, a 
standard protocol for linking, standards for persistent languages, and abstract machines for 
queries and change management. 

Some of these standards exist, some are being pursued by other official or de facto standards 
bodies, and some are new possibilities. While it seems too early to consider standardizing 
today's hypermedia systems with their several limitations, the effort toward building consen- 
sus is helping us to understand these systems better and to identify potential areas where 
standards can help. 

6 Appendix A: Related Standards and Common Formats

This section lists some common external representations of information used for various
purposes. It is included since it represents a beginning of a section on related standards. It also
demonstrates some of the breadth of kinds of objects that hypermedia systems will need to
represent.

communication protocols

    SCSI          Small Computer Systems Interface

external representations for data structures

    XDR           Sun's external data representation

device-independent procedural page/screen description formats

    DVI           for TeX
    ditroff       for troff
    imPRESS(TM)   document for printing on an IMAGEN laser printer
    EPS           Encapsulated PostScript generated by Adobe Illustrator(TM),
                  Cricket Draw(TM), Aldus FreeHand(TM) on the Macintosh and
                  Media Logic's Artisan(TM) on the Sun; also Display
                  PostScript and color versions

media type interchange formats (specific "document contents" like
characters, raster graphics, geometric graphics, sound, video, etc.).
Note: Several of these representations represent structure and content.

    ASCII         text
    DIF           Document Interchange Format used to interchange text and
                  formatting instructions across a wide variety of word
                  processors and publishing systems
    troff         the standard Unix text processing utility
    DCA           IBM's Revisable Form Text Document Content Architecture.
                  Many popular word processors, including IBM
                  DisplayWriter(R), WordPerfect(R), Wang(R), MultiMate(TM),
                  WordStar 2000(R), Samna IV(TM), OfficeWriter(R), and
                  Microsoft Word(R), can store documents in this format.
                  Does not support graphics.
    Scribe
    TeX, LaTeX    popular text formatting language, weak on non-textual
                  objects, primitives for tables
    MIF           FrameMaker's Maker Interchange Format
    Interleaf, Microsoft Word, HyperCard, WordStar, Ventura, ...
                  many products provide a way to save and restore their state
    EGA/VGA/CGA   bitmap screen sizes/resolutions on different PCs
    X3H3 GKSM     Graphical Kernel System Metafile (polyline, polymarker,
                  text, fill area, cell array, generalized drawing
                  primitive). (A second metafile standard provides a way to
                  encode a sequence of GKS commands. The description of the
                  objects, not the image, is saved.)
    PHIGS
    GIF           graphic interchange format
    ISO Computer Graphics Metafile
    PICT          Macintosh standard graphics description format
    pic           a language for typesetting graphics
    HPGL          a popular plotter output format used by many workstation
                  CAD programs like AutoCAD
    IGES          a standard graphics interchange format used by many
                  workstation CAD programs
    MacDraw       Macintosh(TM) MacDraw files (QuickDraw toolbox ROM routines)
    NTSC          U.S. etc. television format standard for production and
                  transmission; Europe uses PAL; HDTV and ACTV are next
                  generation
    SMPTE         Society of Motion Picture and Television Engineers time
                  code for syncing audio, video, film

document/audio-video representation and interchange formats

    SGML          ANSI/ISO Standard Generalized Markup Language. Uses markup
                  (tags) to create an indirection between intent and
                  rendering. Does not support graphics.
    ODIF          Office Document Interchange Format. ODA distinguishes a
                  logical hierarchy and a layout hierarchy
    CD-I          Compact Disk Interactive, compression/decompression formats
    DVI           Digital Video Interactive. Text, audio, video stills, and
                  video motion, at various resolutions, mixed;
                  compression/decompression formats

CAD-specific interchange formats

    EDIF          Electronic Design Interchange Format
    VHDL          VHSIC Hardware Description Language
    CIF           Caltech Intermediate Form

product interchange format

    PDES          Product Data Exchange Specification
    EDI           Electronic Data Interchange

The document log lists bibliographies, conference proceedings, key papers, and other docu-

[1] Jakob Nielsen, "Hypertext Bibliography," Hypermedia, Taylor Graham (ed.), 1:1, 1989.

[2] Proceedings of the ACM SIGPLAN/SIGOA Symposium on Text Manipulation, Portland,

Paul Kahn,

Institute for Research in Information and Scholarship 
Brown University, Box 1946 
Providence RI 02912 

Since the last time we compiled this bibliography in November 1987 for the Hypertext '87 Workshop,
there has been an explosion of hypertext literature. When we started the bibliography project at IRIS in 
1983, we thought it would be possible to collect every book, conference paper and journal article on the 